
Yeah, so when we do that, we actually should see that start here soon. Yep, so there's a delay, but that's a problem. Well, Danielle, I'm trying to find... Is it going to be recorded? Yeah, multiple records. So we're going to record here. The camera's going to be recording right there. And I'm going to have audio over there recording. So the backup rescue is me spending hours in Final Cut Pro to make sure that your video goes live. But hopefully this will work. You know, live is always... So... So you should be able to hear this and that goes in fine. Good.
Live from the Collider in downtown Asheville, North Carolina, this is the fifth annual B-Sides Asheville live stream. We'll be getting started shortly with our first presentation from Dan Burrell with Recorded Future on leveraging automation for threat intelligence at scale. Throughout the morning, you will also hear from Tim Hopper with Cylance on the challenges in applying machine learning to cybersecurity, Eric Kron with KnowBe4 on stopping the explosion of ransomware, and Nathan Donovan with Data Machines Corp on categorical correlations as probabilistic rules. Then, when we return from lunch at 1:10 p.m., you'll hear from Jason Gillam with Secure Ideas on Folding Steel, the SamuraiWTF reboot, then Alejandro Caceres with Hyperion Gray on Breaking Everything, Adam Mathis
with Red Canary on testing endpoint security solutions with Atomic Red Team, and finally our closing keynote presentation at 4:10 p.m. with Chris Nickerson, the CEO of Lares Consulting. So sit back and get a cup of coffee or tea, depending on your preference, for we will be getting started soon.
Up next, leveraging automation for threat intelligence at scale with Dan Burrell of Recorded Future at 8 a.m. I think you're golden. Excellent. And you're live. And I'm live. So good morning. Welcome to B-Sides Asheville. I have the distinct honor of being your first speaker today, which is great because I can say with confidence that this is the best presentation you will have seen so far today. I'm going to be talking about, well, the title of my presentation is Machine to Machine: Automating Threat Intelligence. So, just as a general overview, I wanted to talk a little bit about what threat intelligence is and then how you can use automation to empower your team to make faster, better decisions in your
security operations. It's a little bit about me. I'm a threat intelligence consultant with Recorded Future. I'm also the manager for the automation and advisor group there. I've been working in the private sector for a little over five years. Some experience in the financial sector, mostly vulnerability management before I came to Recorded Future. Also a U.S. Army veteran with over 13 years of experience with the military. There's a picture of me there in front of some big machines because I love machines. Machines make our lives easier in so many different ways. In this case, it let us dig up explosive devices and disable those without getting any of our soldiers hurt. Technology that we'll be talking about today will help your teams as I said,
be stronger, better, faster when it comes to identifying and engaging threats in your environment. So here's a good question. How much time do your analysts spend processing threat intelligence versus analyzing it? A lot. I think I've seen some metrics that show it's an 80/20 split. You're spending 80% of your time just collecting and managing all of this data and then you've got about 20% left to actually figure out what does it mean.
So, automation. This is based on a poll of our customers at Recorded Future last year, just looking at how automation has improved their programs. Some of the reported metrics were an 80% reduction in time spent on data management, a 50% reduction in the time spent preparing reports, and roughly a 10x improvement in the speed of analysis during critical incidents. So we see reporting from our customers that when they deploy these technologies and they can move the data around and process it in an automated fashion, it saves their analysts a lot of time. And so their analysts can spend a lot more time really figuring out what the threats to your organization are and how to deal with those. We should
start by defining what threat intelligence is. There's a lot of marketing out there on "threat intelligence," and I'll put air quotes around that. I will say, hint: it's not a pew pew map. You see a lot of this stuff in marketing. You see these little lines like, I don't know, Russia's attacking the United States and, I don't know, we're attacking Alaska. - I think they're all bidirectional. - Yeah. But what does this tell me? This doesn't tell me anything. I don't care if there's some attack vector from one country to another. What I care about is, how do I take action on this? There's nothing I can do with this. Tell me something, you know, who's specifically attacking whom? What is the infrastructure they're using? What are the
actual malware signatures they're using? You know, what are the things that I can actually, tangible things that I can look for and act on in my environment? So that's what threat intelligence is. It's not just information, but it's the context that makes that information useful to your organization. So Gartner has a great definition, a little wordy, "Threat intelligence is evidence-based knowledge, including context, mechanisms, indicators, implications, and actionable advice about an existing or emerging menace or hazard to assets that can be used to inform decisions regarding the subject's response to that menace or hazard." So again, a little wordy, pretty comprehensive, but some key things there, we see, you know, actionable, it's something that you can take action on, includes context, and
in some cases even include some type of advice on what to do. That's my definition, a little more concise, but kind of the same thing. Threat intelligence is actionable information regarding current or anticipated threats that influence decisions within an organization. So when we're looking at threat intelligence, when we're evaluating whether a particular source of threat intelligence is valuable to our organization, we want to make sure that the information provided is actionable and relevant to threats in our environment and also something that we can use to make decisions. So most of us work for companies that spend money and they want to know what their ROI is on their security program and sometimes it's hard to define that. So as
a business justification you can take this back, threat intelligence facilitates the identification and reduction of risk through a better understanding of your operational environment. So without threat intelligence you're essentially operating blind. You don't know without, you probably know what's going on inside your network, but without understanding what the external threats are and what that threat landscape is, you have no context to figure out what's a threat and what's not a threat within your environment, or how to prepare for anticipated threats. To paraphrase Sun Tzu, if you don't know yourself and you don't know your enemy, you have to have both of those things, you have to know yourself and your enemy, you'll lose almost every time.
But if you know both of those things, you set yourself up for success. I love threat intel. We have a saying in the Army that enthusiasm will take you very far. So threat intelligence should permeate all aspects of your security program. It's not a siloed function that's just talking to itself, like, "Hey, these are really cool. Look at all this neat stuff. This threat actor is really interesting." It should not be peripheral to your organization. And again, I'll kind of go back to how we do intelligence in the Army. In any command structure, your S2, your intelligence staff, are key advisors to the commander. And no decisions get made within a military unit without consulting your intelligence team. Because you have
to understand the impacts of the enemy on your actions and how your actions might impact the enemy. And that's what your intelligence team tells you.
So at Recorded Future, we deal with a lot of different teams at different maturity levels. So there's some qualities that we see in mature teams. Mature teams have clearly defined intelligence goals. It means they know what they want to look for and they know how to get it. They utilize deliberate and repeatable processes. So they're not operating ad hoc, they're not flying by the seat of their pants, they have regular deliverables that they're creating in repeatable ways to build robust intelligence products inside your organization. And they're also trusted influencers. The other aspects of your security team trust them, your board trusts them. When they say this is a threat, people take that seriously. and that
influences their actions within the organization. Like Archer. It's not the size of your data that matters, it's how you use it. So you can have all of the threat intelligence in the world, but without a mature team with mature processes and practices, it's essentially useless to you. And I'll harp on this again. High quality feeds, they have to include context. They have to have something that tells you definitively, what can I do with this? How does this impact me? This little picture here, orange and white Volkswagen Passat. Let me tell you a little war story. So I was a platoon leader doing route clearance operations in Mosul, Iraq, and we were out every day driving around looking for threats to
our troops. And one morning I'm prepping my platoon to roll out and we get this breathless intel guy runs up to the motor pool with this report. We have a reliable report of an imminent attack, vehicle-borne IED, today. The vehicle they're using is an orange and white Volkswagen Passat. So be on the lookout for an orange and white Volkswagen Passat. Do we have any other veterans here that have been to Iraq before? What do the taxis look like? That's a taxi. So we roll out of the gate, and there's literally hundreds of these things. Yeah, I can see we're right by the egg right there. I'm like, there's two right there. So we have what might be good intelligence. It might be a good source.
There might actually be an imminent attack. But the indicator that I've been given isn't useful to me. So you need intelligence that is specific enough that you can find it and definitively know this is a threat and deal with it. Give me a license plate number. Give me a last seen location. Something that helps me figure out what I'm looking for. Okay, that's my soapbox for threat intelligence. I'll get into the automation piece here. So what is automation? We've got some cool little robots there. Most of the times it's not robots. A lot of times it's just software. So automation is anything that takes the workload off of your analysts. When you automate, you're putting
systems in place that do the boring stuff (there's a book, Automate the Boring Stuff) that eats up your analysts' time. And I'll talk a little bit about the key areas that you can automate in your threat intelligence program. There are a lot of pieces to this. It can be scripts, it can be appliances, API communication, machines talking to machines. Automation does not replace your analysts. You still need analysts. But what it does do is it makes your analysts, as I said, better, stronger, faster. Think of your analysts as cyborgs, or, as our Swedish friends at Recorded Future like to say, threat intelligence centaurs, because they have the speed of a horse but the brain of a human. So, common candidates for automation. These
are some areas where organizations can really get a lot of ROI on automation efforts: data management, data enrichment and correlation, rule and alert generation, and threat research. These are the areas where our customers at Recorded Future who are successful with automation are spending their efforts. So for data management, think of data management as any time you're moving data around, copying data, transforming data, doing something to data. And automation in this case is important because the scale of the data sets that we work with is enormous. I mean, you might have a list of 100,000 IP addresses that are known to be tied to command and control infrastructure. You might have millions, literally millions of data
points that you're pulling into your program to try to make sense of the threats. You could, I guess, if you had infinite money, just hire enough people to do that. But to really keep up with the pace of the threat landscape, you need something that can manage all of that data in an automated fashion. Data entry: your team should not be doing data entry. Unless you're literally copying from handwritten notes, there's no reason for a human to sit there and type data into a system anywhere you can avoid it, because they have better things to do. There's a little diagram here of how threat intelligence data is often managed. We have our threat intelligence sources kind of out
in the world, and most organizations will pull that in and store it somewhere. This isn't exact. Sometimes your SIEM is the one that's pulling the data and storing it. Some organizations have a TIP and they're pulling their data into the TIP. A TIP is a threat intelligence platform, if you haven't seen that term. So they're storing the data there. I have some customers that just roll their own threat database and are using Mongo or something to collect all the information in a central place. And then you can share that data out via API calls to all of your appliances. Push it to, or pull it from, your SIEM, your
orchestration and incident response tools, and your vulnerability scanners. Data enrichment and correlation. A couple definitions here. Enrichment, in this case, is adding context to internal data. So you have some data from inside your environment, and you're trying to make sense of it, so you enrich it with threat intelligence. Correlation is surfacing interesting things in your environment. A lot of times in this case, you're starting with some type of external data, a threat feed, and you're saying, "Are any of these things in my environment?" So let's see if we have indicators of threat based on what we know from threat intelligence feeds. And your computers are always going to be faster at this than your analysts.
It's just infeasible to have an analyst look through data from a threat feed, ask the question, "Is this in my environment?", and then go manually look for it. There's just too much data. Even if you have a very narrow threat intelligence feed, say you've only got a few hundred indicators, you've probably got millions of events in your organization. SIEMs make that easier to search, but it's still kind of an insurmountable problem. So this is what data enrichment looks like. You've got some type of internal indicator. It could be a wide variety of things: an IP address, a hash, a URL, a file name, something, and you say, "What is this? This
looks suspicious, what is it?" So you have some type of automated process that does a data lookup, either from your own internal store of threat intelligence, or it looks to external feeds, and then it brings back several pieces of information and attaches to that indicator. Might be some type of risk or confidence score that we're 50% confident that this is an indicator of risk. Could be some risk evidence. That could be things like this IP address has been observed as part of command and control for Zeus or some other malware and related entities. So you might say this IP address has been associated with these domains and these hashes, et cetera. And this is really useful from a research incident response standpoint, especially when this is automated, because it
pulls all this information in and attaches it to the artifact you have in your environment. And then it gives you other things to look for. So you could say, we see suspicious outbound traffic to this IP address, we do our data enrichment, and well, this has been associated with these hashes; other sightings of this IP address have been seen with these hashes. So then you say, well, do these hashes exist in our environment? So you pivot from there, and then you can confirm or deny whether or not that one piece of information, that data point, indicates something larger in your environment. And that can happen very quickly with automation. Then there's data correlation. This is typically done more in bulk. So you're using two large sets
of data. You start with some type of threat intelligence feed, an external feed that you've got. You have internal, say, log data, and then you do an intersection on those and ask, what do these two lists have in common? That resulting sub-list is your correlated data. So say I have a list of URLs that have been observed in phishing campaigns in the last 48 hours, and I have logs from my web proxy, all the URLs that have been requested through our web proxy. Intersect those two sets and say, "Well, we've got these machines that are calling out to URLs that have been observed in phishing campaigns." Using automation, this happens in seconds. So you can very quickly figure out
if you have a threat in your environment.
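To make that concrete, here is a minimal sketch of that intersection in Python. The file names, feed format, and proxy-log column are hypothetical placeholders, not any particular product's output:

```python
import csv

def load_feed(feed_path):
    """Load one indicator per line from a threat feed export (assumed format)."""
    with open(feed_path) as f:
        return {line.strip() for line in f if line.strip()}

def load_proxy_urls(log_path):
    """Pull the requested URL out of each web proxy record (assumes a CSV with a 'url' column)."""
    with open(log_path, newline="") as f:
        return {row["url"] for row in csv.DictReader(f)}

def correlate(feed_urls, proxy_urls):
    """The intersection is the correlated data: indicators actually seen in our environment."""
    return feed_urls & proxy_urls

if __name__ == "__main__":
    phishing_urls = load_feed("phishing_urls_last_48h.txt")   # external intelligence
    requested_urls = load_proxy_urls("web_proxy.csv")          # internal log data
    for url in sorted(correlate(phishing_urls, requested_urls)):
        print(f"ALERT: {url} from the phishing feed was requested through our proxy")
```

In practice the same pattern runs inside your SIEM or TIP; the point is that the set intersection itself is trivial for a machine and hopeless to do by hand.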
So, rule and alert generation. This topic can be a little more contentious. I know some people are of the opinion that your rules should be manually created, that your team should be building the rules themselves based on their understanding. But in my opinion, you can't keep up with the pace of the threats, especially with IP addresses. For IP addresses used as attack infrastructure, the useful life of that data is hours to days. Just because an IP address is risky right now doesn't mean it's risky tomorrow. And when you're dealing with threat lists with thousands, tens of thousands, hundreds of thousands of IP addresses, there's no way your team can build rules to detect those in a manual fashion. You have to build automated systems that will automatically generate the rules
to detect those in your environment. You should also tune these based on what's important to you. Again, as it says there: don't blindly block things. That's important. Don't blindly block things. A lot of security teams get yelled at because they got a list of IP addresses associated with some botnet, they just blocked the whole list, and then they are impacting their business. So you have to build some logic into this rule generation that makes sense. A good way to do that is whitelisting. We do that at Recorded Future with our risk scoring. We get indicators that IP addresses have been associated with malware, but half of these are AWS IP addresses or Cloudflare IP addresses, or public DNS servers, things like that. So we mitigate that with whitelists.
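A minimal sketch of that kind of allowlist logic in automated rule generation, plus the confidence threshold he returns to later in the SIEM example. The field names, threshold, and allowlist entries are illustrative only, not any vendor's schema:

```python
import ipaddress

# Hypothetical allowlist: ranges you never want auto-generated rules to block
# (cloud providers, CDNs, public DNS, key business partners).
ALLOWLIST = [ipaddress.ip_network(cidr) for cidr in (
    "8.8.8.0/24",       # example: public DNS
    "104.16.0.0/13",    # example: CDN range
)]

def is_allowlisted(ip):
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWLIST)

def select_indicators(indicators, min_score=75):
    """Filter scored IP indicators down to the ones worth turning into alert rules.

    `indicators` is assumed to be an iterable of dicts like
    {"ip": "203.0.113.7", "score": 89}; adapt to your feed's actual schema.
    The survivors would then be rendered into whatever rule format your
    SIEM or firewall expects."""
    for ind in indicators:
        if ind["score"] < min_score:
            continue                 # not confident enough to act on automatically
        if is_allowlisted(ind["ip"]):
            continue                 # shared infrastructure: alert maybe, block never
        yield ind
```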
So you build it so it's smart enough to say: I've got some threat intelligence, build some rules, with the caveat that they're built around these business requirements. And then finally, threat research. This one's a little softer; there's more human involvement in threat research. But what automation does is it allows you to process large collections of data to surface interesting things that could potentially help inform your security program. As in, automation plus large sets of data equals awesome threat hunting. I'm gonna talk a little bit about something I did recently at Recorded Future. So one of the data types that we collect on is
our vulnerabilities. And we build risk scoring around vulnerabilities to say whether we consider them not a risk, or low criticality all the way up to very critical. And so when we make that data available to customers and we have what we call threat lists, we have a vulnerability threat list that contains every vulnerability that we consider a critical vulnerability, saying you should patch this. So this is the breakdown by CVSS score of those vulnerabilities. See we've got 36% are critical. We've got 62%, 62.40% are high. And then medium-low or no CVSS score is about 1% of the entire data set that we would consider critical. So there's medium and lower, small population. Doesn't look like that much of a threat. Maybe you're like, well, we should definitely
prioritize these highs because a lot of them are being exploited, or because there's a lot of them in the set. But I was curious: if we take that set of critical vulnerabilities and process it to only look at the ones that have current evidence of exploitation, taking the full set of vulnerabilities that we're getting and saying I'm only interested in the ones that have been observed being exploited by malware in, say, the last week, now we see a very different breakdown. We see critical stay at about the same percentage, but you'll notice that medium, low, and none (none meaning there's no CVSS score assigned to that vulnerability) is now over 25% of the pie. So what does this tell
me as a threat intelligence analyst or vulnerability management person at my company? It says, hey, we should pay attention to these medium and low vulnerabilities because they're being exploited. It gives me the context to say, let's not just blindly patch. Again, this goes back to knowing your enemy. Let's not just start with the criticals and work down until we get to the lows, because you'll never get to the lows. The volume of vulnerabilities that are being disclosed relevant to your environment, realistically, you're probably never going to patch all of them. So you have to figure out which ones are most important. And I've worked for organizations that they've got a 60-day patch process, right? How often does Microsoft release vulnerabilities? Every 30 days.
So all they did was constantly fall behind, because they were patching slower than the vulnerabilities were being disclosed. So this, again, shows you how you can use that context and how you can use automation to surface that context. And what I did here is I basically built a series of Python scripts that took the raw data, compared it to some other sources, and surfaced this. I didn't have to go through and look at each one; the original list in this dataset is like 38,000 vulnerabilities. There's no way I'm gonna sit there and go through 38,000 vulnerabilities and see which ones are being actively exploited.
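Those scripts were roughly of this shape: keep only the vulnerabilities with recent exploitation evidence, then re-bucket them by CVSS. This is a sketch under assumed field names and an assumed source for the exploitation sightings, not the actual Recorded Future code:

```python
from collections import Counter
from datetime import datetime, timedelta

def cvss_bucket(score):
    """Map a CVSS score onto the buckets used on the slide."""
    if score is None:
        return "none"
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"

def recently_exploited(vulns, last_seen_by_cve, window_days=7):
    """Keep only vulnerabilities observed being exploited by malware in the window.

    vulns:            iterable of dicts like {"cve": "CVE-2017-...", "cvss": 5.3 or None}
    last_seen_by_cve: dict mapping CVE id -> datetime of the most recent exploitation sighting
    """
    cutoff = datetime.utcnow() - timedelta(days=window_days)
    return [v for v in vulns if last_seen_by_cve.get(v["cve"], datetime.min) >= cutoff]

def breakdown(vulns):
    """Percentage of the list in each CVSS bucket."""
    counts = Counter(cvss_bucket(v["cvss"]) for v in vulns)
    total = sum(counts.values()) or 1
    return {bucket: 100.0 * n / total for bucket, n in counts.items()}
```

Run breakdown() over the full list and again over the recently_exploited() subset and you get the two charts being described.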
So, integrations. It's all good in theory, but how does this look in your environment? How do you deploy these solutions with the technology that you're already using? These are three things that customers are commonly doing. They're correlating in their SIEM to build targeted alerting within their environment. They're enriching artifacts in their incident response or orchestration tools for incident investigations. And they're correlating and enriching the vulnerability data that's coming out of their vulnerability scanners. So this is what it looks like in your SIEM. You've got your threat feeds. You've got your log data. Same pattern we saw before: we're looking at the intersection of the threat feeds
and your logs. We've got correlated data. But then from here, we're taking that and applying some alert criteria. We might say we've got an intersection of these IP addresses from our outbound firewall logs and known command and control. Let's apply some logic to say I'm interested in anything that has a confidence score over 75, so we're at least 75% confident that this is actually a server or an IP address that's been abused. Then we'll apply some whitelists. We're gonna ignore public DNS servers. We're gonna ignore AWS and Cloudflare and other CDNs. And then we get a much smaller set of IP addresses that we'll start alerting on in our environment, and have that done automatically. The way this is done with a lot of our
customers when they're pulling our data, they'll do this kind of at various times. So for IP addresses, we recommend every hour to update your alerts based on the latest threat intelligence. Hashes and vulnerabilities, you can probably get away with doing that once a day. Domains and URLs, maybe four or five times a day. So that's happening constantly. So as those processes are running, your monitoring tools are automatically detecting new threats without the need of human interaction. What does that do? That frees your analysts up to actually respond to the alerts that are coming up. They're not building rules, they're responding to incidents in your environment. So for incident response orchestration tools, this is a common pattern that we
see with integrations. You have an incident, something's happened, A lot of times you don't know if it's a security incident yet. It's just a thing. There's some alert triggered and we have some indicators or some artifacts, but we don't know what it is. We're trying to make sense of it. So we'll take this and say we've got an IP address, a domain, a URL, and we'll enrich each of those. The IP address, maybe we'll look at the ASN and say, what neighborhood is this IP address in? Are there a lot of other risky IP addresses to the left and right? We'll look at is this IP address on any command and control block lists? Has it been reported, observed, supporting any botnets? For
the domain, we'll enrich that. We'll probably pull in some whois data. Who owns this? Who registered it? When was it registered? How old is it? We might look at subdomains. Great indicator here: a domain with a lot of observed high entropy subdomains is an indicator that there's something really fishy going on. And also there are threat lists, block lists for domains, that you can look at, and that gives additional context. URLs, a lot of times, are mostly seen either with phishing campaigns, so URLs that are being linked out from emails, or with malware droppers. Those are two common areas where we see URLs being deployed. So again, the enrichment automation can say, this URL has been reported
as appearing in these phishing campaigns, or this URL has been reported as dropping these malware families. And then the last piece here: vulnerability correlation and enrichment. The way this normally works is that most commercial vulnerability scanners don't support pushing data into the scanner, right? They don't support the use case of doing enrichment and correlation in the scanner. So the way most of our customers are doing this, and the way I see this most often, is you're taking scan reports. Your vulnerability scanners are constantly running. They're generating reports daily, weekly, whatever your cadence is. You take those reports, and you either just work with them on a file system, or they get pushed into something like Splunk, where you can search through them like
you would any other piece of log data. And then once you have it out of the vulnerability scanner, you can do bulk enrichment. We can see here this little spreadsheet. This is actually intelligence from Recorded Future. So we've got a list of vulnerabilities that have come out of a scan, and we've got a risk score. So we're saying, hey, these are highly critical vulnerabilities according to the intelligence we have. And then we include risk evidence on that. So what that tells your vulnerability management or patch management team is that this vulnerability is risky, not just because it has a high CVSS score, but because of the evidence that's provided by our threat intelligence providers. One of the challenges with this is that a lot of
vulnerability scanners don't use CVEs in their reports; they'll use their own proprietary vulnerability IDs. So sometimes you have to do some type of data mapping. If you're using Qualys, for example, they assign a QID to vulnerabilities. So you have to say, we've got these QIDs in our environment, map each of those to the CVEs that are related to that QID (sometimes it can be multiple CVEs), and then from that you do the bulk enrichment.
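A sketch of that mapping-and-enrichment step. The CSV column names and the risk-lookup function are stand-ins invented for illustration; a real integration would pull the QID-to-CVE mapping from the scanner's knowledge base or API:

```python
import csv

def load_qid_to_cves(mapping_csv):
    """Build {QID: [CVE, ...]} from a scanner knowledge-base export.

    Assumes columns named 'qid' and 'cves', with CVEs separated by semicolons."""
    mapping = {}
    with open(mapping_csv, newline="") as f:
        for row in csv.DictReader(f):
            mapping[row["qid"]] = [c.strip() for c in row["cves"].split(";") if c.strip()]
    return mapping

def enrich_findings(findings, qid_to_cves, lookup_cve_risk):
    """Attach CVEs and external risk context to each scanner finding.

    `findings` is an iterable of dicts with at least a 'qid' key.
    `lookup_cve_risk` stands in for whatever threat intelligence lookup you use;
    it should return something like {"score": 89, "evidence": "..."} or None."""
    for finding in findings:
        cves = qid_to_cves.get(finding["qid"], [])
        finding["cves"] = cves
        finding["risk"] = [r for r in (lookup_cve_risk(c) for c in cves) if r]
        yield finding
```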
You can also do correlation with vulnerability data. What I just described was enrichment, attaching additional information to the vulnerabilities from our environment. But you could just as easily say, going back to the slide I had before, let's take those medium and lower vulnerabilities that are being exploited and correlate those with our scan data, and see how many of those vulnerabilities are in our environment. Just because a vulnerability is out there doesn't mean you're using the technology. So again, you have to understand yourself in order to react to the threat. One more piece, and this is the last piece I've got: learn to code. I'm not saying you have to learn to code, but someone on your team should know how to code. I recommend Python. One, because Python's easy. Two, because almost every security tool I've worked with is either written in Python or supports Python. It's kind of the language of the industry. You don't have to use it. You could use Ruby. You could use C#. You could
use Java. I mean, if you really wanted to, you could use C. But knowing how to code and having that capability on your team will save you so much time. Say, for example, you get some new threat feed, some new data that you think is valuable, but it's in a non-standard format, a format that your tools are not equipped to consume. If you have someone who knows how to do this, in an afternoon they can knock out a script that'll do the transformations you need to consume that data into your feed.
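That kind of transformation script really is an afternoon's work. As a hedged example, suppose the new feed arrives as pipe-delimited text and your tools want one JSON object per line; the field layout here is invented for illustration:

```python
import json
import sys

def transform(line):
    """Convert one pipe-delimited feed record into the JSON shape our tools expect.

    Assumed input layout: indicator|type|first_seen|description
    """
    indicator, ioc_type, first_seen, description = line.rstrip("\n").split("|", 3)
    return {
        "indicator": indicator,
        "type": ioc_type,
        "first_seen": first_seen,
        "description": description,
    }

if __name__ == "__main__":
    # Usage: python transform_feed.py < raw_feed.txt > feed.jsonl
    for raw in sys.stdin:
        if raw.strip():
            print(json.dumps(transform(raw)))
```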
The same goes for threat research: if you can code, or someone on your team can, they can knock out in an hour or two scripts that will answer the questions you have about the data you're looking at. So have this capability on your team, and I say on your team because if you want to rely on expertise elsewhere in your organization, you may or may not get those resources. This comic is obviously dated. That's Python 2? Everyone should be on Python 3. Everyone's using Python 3, right? All right. Questions? Awesome talk. I had two questions, and feel free to ignore both or either. Okay. Could you go into a little more detail about automating the tuning of rules for a particular organization? What kind of information about your organization or the
inbound threat? Yeah. (Remember to repeat the question.) Yeah, so, okay. So the question was, what are some best practices around tuning alerts in an automated fashion for your organization? The classic answer is, it depends on your organization. What we see a lot of times is customers will look at the threat landscape for their industry vertical, or for companies of similar size to themselves, and they'll say, our industry peers are being hit a lot with these types of campaigns, whether it's whale phishing campaigns, ransomware attacks, whatever it is. And so you take that and you tune your alerts to look for those very specific threats in your environment. And that's one thing. The
other thing is obviously whitelisting. One thing that's good to do is figure out the IP addresses of all your key business partners. Make sure you have a whitelist of those IP addresses so you don't inadvertently block some critical connection in your environment. And the other thing, too, for teams that have more advanced threat hunting capabilities, is you may start building out TTP profiles, attack patterns, and there are ways to build your alerts around those attack patterns, again, to look for very specific things in your environment. And I'm an advocate of: you can have an alert that says, tell me any bad IP, and also have an
alert that says, tell me specifically if I see an IP address related to this botnet because company X and Y have been hit by this, and they're very similar to me. What was your second question? Yeah. So... A lot of this is being done actually in their tools. So they're doing this in Splunk, and they're doing this in QRadar, or they're doing this in Phantom or Resilient or some system like that. So a lot of those systems have kind of some infrastructure built into them. So again, it depends on the amount of data that you're working with. Don't know any kind of benchmarks off the top of my head for how you do that. It would just be, you might, I don't know, you
would have to... Yeah, system dependent. Any other questions? Start here and then we'll come over there. - Favorite sources of threat intel, sources of threat intel you don't trust. - Favorite sources of threat intel and sources of threat intel I don't trust. Well, I work for a threat intelligence company, so I like our data a lot. It's one of the reasons why I came to work for Recorded Future, was because I was really impressed with the product. So good sources of intel, there's a lot of feeds out there that are valuable. A lot of them are, there's organizations that all they do is they collect indicators around specific campaigns. There's organizations that will provide you indicators for Zeus, I keep
mentioning Zeus, but Feodo, or specific ransomware families. So finding those feeds that are relevant to the threats to your organization is pretty important. Feeds I don't trust? Anything that doesn't have context. If all you're giving me is a list of indicators, but you're not telling me when you saw it, how many times you've seen it, or what other things you saw with it, it's essentially useless. So again, if you're evaluating threat intelligence feeds, evaluate them both on whether they provide intelligence on threats that are relevant to you and on whether they provide you enough context to actually act on the intelligence. You have a question over here? Do you have any quick go-to cheat sheet resources of things that you should
be logging on, like endpoints and network-based devices? Because a lot of times when I'm doing pen testing or red-teaming, one of the hardest things I come up against is the problem around the-- I don't have a cheat sheet. Okay, the question was, do you have a cheat sheet for the best places to collect log data from, like in your SIEM? I don't have a cheat sheet per se. I've seen some pitfalls where customers are kind of shooting themselves in the foot with what they're collecting. One thing I would say not to collect, or not to care too much about, are dropped connections from outbound or inbound traffic. So if your firewall is dropping inbound connections, you might want to know about it from
like, there might be a DDoS going on, but those shouldn't be security incidents. Because inbound traffic was blocked by the firewall. Firewall's doing its job. It's not an incident. But I see a lot of customers who are collecting, they're pulling those dropped connection events from their firewalls, and they're just being flooded with alerts. We had 2,000 attempts to access our network from blocked IPs today, and I, as an analyst, now have a queue of 2,000 alerts to go through. So they've... So you want to be selective to pull from the areas that are most likely going to represent threats. So outbound IPs, outbound URLs from your firewalls or proxies or next-gen firewalls, that tends to be
high fidelity, because it's something inside your environment that's talking out. So if it's talking out to a malicious IP address or malicious URL, that could be bad. It warrants an investigation. The question was, if you're at a new organization and you're just standing up your SIEM, what are the top things you should start looking for first? Again, I would go back to that: outbound traffic is important. And your endpoint protection will probably pick this up, but we've got customers that are actually pulling log data from their antivirus into their SIEM so that they have those hashes and other signatures in their SIEM that they can alert on there. There's, you
know, because it could be that the endpoint protection did its job, found something, and cleaned it up, but you want to make sure that you capture that event in a centralized location. And I would make sure you're collecting from your firewalls, your proxies, any load balancers, and actually DNS. A lot of people aren't watching their DNS servers, and that's a problem because DNS tunneling is a very common technique that threat actors use. That again goes back to the comment about high entropy subdomains. If you're seeing a bunch
of really high entropy subdomains in outbound traffic from your organization, even if those aren't considered malicious, that's an indicator that someone's tunneling information out of your network. So you definitely want to collect DNS. So I'd say firewalls, proxies, DNS, and maybe some other network infrastructure pieces that might capture some of that data.
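If you want a rough programmatic version of that high-entropy check, Shannon entropy over the leftmost DNS label is a common heuristic. This is only a sketch; the length and entropy thresholds are arbitrary starting points you would tune against your own traffic:

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Bits of entropy per character in the string."""
    counts = Counter(s)
    total = len(s)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_like_tunneling(fqdn, threshold=3.5):
    """Flag DNS names whose leftmost label is long and high-entropy (encoded data often is)."""
    label = fqdn.split(".")[0]
    return len(label) >= 20 and shannon_entropy(label) > threshold

# looks_like_tunneling("a9f3c0d1b7e64f28aa31c5.example.com") -> likely True
# looks_like_tunneling("www.example.com") -> False
```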
Any other questions? All right. Nothing? Oh, one. Yeah. Okay. So you mentioned whois. I mean, that is useful, but we kind of have a problem now that the European regulations have forced... What's going to replace that indicator for doing that sort of research, or are we just waiting until ICANN works out whatever sharing program they want to create, sometime five years from now? Yeah, so the loss of whois is problematic, or could be problematic. I'll caveat that to say most threat actors aren't registering domains under their own name. They're using third parties, or they use the same fake information. Or, what I see most often is they'll use some third-party anonymizing registrar out of Eastern Europe that registers on their behalf, and now you've got a bunch of domains that are just registered by some shell company somewhere. So it is a degradation in the information we can get around those sources, but I don't think it impacts us that much. Not as much as some people are concerned about. There are other indicators, non-PII related indicators
around domain registrations that are still useful. You still have the timing; you still know when it was registered. If it was registered yesterday and it's a typosquat of one of your domains, that's a concern. If it was registered yesterday, it's one letter off from my domain, and now I'm seeing traffic in my environment, that immediately tells me this is something I need to look at. Anything else? All right, thank you so much. I'm John, I'm one of the organizers. We really did well this year. We sold 190 tickets and we got great sponsors. So this year, we decided to do some really good speaker prizes. And so Dan here is getting some of the first
ones. - All right. - So that we've had this last year. - All right, that's going on my laptop. - But we also did some limited edition coins this year. The coin says on the front, "Think globally, secure locally," because if you ever looked under the B-Sides logo, it says "local" in binary. And the back of the coin is a dragon and a knight. And this year the coin says "in the cyber realm" in binary, "here be dragons" in Latin, and "defeated by no enemy." So he's getting the first coin. Thank you very much. Since this is Beer City, USA, a local company made these. These are all stamped with
B-Sides logos. Awesome. And also glasses to go with it, with a B-Sides logo. That's great. So Dan is getting the first speaker's gift. Thank you so much. The next talk is supposed to be in about 10 minutes. Up next, challenges in applying machine learning to cybersecurity with Tim Hopper of Cylance at 9 a.m. ♪ music playing ♪
Can you just shut that door? Okay. Thank you.
- Alright, I'm Tim Hopper, I'm a data scientist at Cylance. I assume most of you have at least heard of Cylance. We are a next generation antivirus, anti-malware product. This is the first security conference I'm speaking at; actually, the first security conference I've been at. Thank you. Unfortunately, after I submitted to apply to speak here, I was like, "Oh, they probably won't be interested," so I bought tickets to the Mandolin Orange concert in Raleigh tonight. So I'm gonna have to jet pretty soon after I talk, but if you wanna get a hold of me, I'm happy for you to. I'll stay around for maybe an hour or so, but I'm gonna have to go pretty quick. So
my background is in math and operations research, which is kind of applied math, not in cybersecurity so much. I did a minor in computer science in college, but I'm relatively new to cybersecurity; I've been doing it for, I guess, a little over three years now in the data science space. I came into cybersecurity through data science and machine learning type applications. I worked at KDM, then Distil Networks after that, which is a malicious web bot mitigation company, and I'm now at Cylance. I love Western North Carolina, grew up visiting my grandparents here, and dream of moving here one day. I'm in Raleigh, working remotely for Cylance, and I have pictures of the mountains if you'd like to see them. So as I start,
there's a distinction that I think is mostly useless here, which is machine learning versus artificial intelligence. In the last couple of years, artificial intelligence has had a resurgence as a term, largely through marketing departments. There's really no significant distinction to be made in talking about applying artificial intelligence versus machine learning. So my talk is challenges in applying machine learning to cybersecurity; you could replace that with artificial intelligence and it would be exactly the same. Except if you use the term artificial intelligence, people might give you more money, so it's less challenging. But we see these articles coming around, and as I was looking for articles on the hype
around AI and machine learning in cybersecurity, there are also a lot of articles that I think take more reasonable approaches. But these hype articles, executives love to post them on LinkedIn and say things like, "with its scalability and automation capability, machine learning can efficiently analyze large pools of data, correlate simultaneous events, and learn to separate atypical from typical behaviors, thus enabling it to detect potential threats even at an advanced level," the title there being "AI is the future of cybersecurity." And certainly there are a lot of great and valuable applications of machine learning to cybersecurity, but it's not simply a magic bullet. I mean, this is actually a great follow-up to
Dan's talk just now, in thinking about automating threat intelligence. Someone could come along and say, "Well, you don't even need all this automation effort. You just need to apply machine learning, apply artificial intelligence, to your threat intelligence work. You don't have to do all this manual automation work; that's too much." And the reality is that the manual effort, the human effort, isn't just going away overnight. And applying machine learning effectively is an extraordinarily challenging, and thus very expensive, thing to do. To try to give a little bit of technical context, I'm not going to go deep into what machine learning is here, but in this talk I'm largely going to be focused
on this concept called supervised learning, which probably many of you have heard of. Machine learning can roughly be divided into supervised and unsupervised learning methods, and I'm focusing here on supervised learning methods because if supervised learning methods are hard to apply to cybersecurity, unsupervised methods are even harder. So the things that I'm suggesting would apply even for unsupervised methods. Roughly, supervised learning is a set of methods for algorithmically learning patterns from data (terrible grammar here) and making predictions on unlabeled data. So you're trying to learn patterns from data where you have labels for the thing you're trying to predict, and then you can apply the patterns you've learned to make predictions on data that doesn't have labels. So
you could think of my previous company in malicious web bot detection: we had essentially enriched Nginx logs coming in from customers making HTTP requests; we get to see this HTTP request, and we want to make some prediction: is the user or system behind that request malicious or not? So you might go add a bunch of labels, manually label data, and learn patterns that you can use to predict in the future. Or in the anti-malware case, you can collect a bunch of binary files, label them as malicious or not, and then you teach your machine learning model, which is just some sort of mathematical formulation, to detect patterns in those files that correlate with malicious or not, and then you can
make predictions in the future. So I'm gonna use the term model pretty liberally. It can mean a number of things, but it's really just a function that we can train on some data to make predictions about something. And training means finding parameters for that function to make specific predictions for your example. So you can think of the model as a general class that has all these free parameters that you can fit, and then you are automatically learning, mathematically updating, your parameters based on your data to make predictions in the future. So that's maybe a three minute introduction to machine learning, just to give you a little more context for the terminology in the rest of the talk if you don't have it. So these
are some applications I could think of for machine learning in the cybersecurity space. Bot detection, which I was describing and which Distil Networks provides. I mean, this is a huge space of companies trying to stop malicious hackers from stealing their data or DDoSing or doing account fraud, all kinds of things. It's a big space, and it's almost impossible to do strictly manually, if for no other reason than it happens in real time; you have to be able to respond to HTTP requests instantly. Intrusion detection. Biometric authentication, obviously a huge area these days. Malware detection, like Cylance and many others are providing these days using machine learning. Automated penetration testing. And of course one of the most famous examples and
really one of the most successful examples here is spam detection. We all remember our email 15 years ago, where we just got tons of spam in our inboxes, and now if you use a major email provider you almost never see spam anymore. Those are machine learning models, or probably more realistically a whole combination of models and probably even some human rules and things, but spam detection is an incredibly effective system, and I think there are some things about that particular problem that make it easier than others and make it so successful. So if you've done any study at all, even casually, into machine learning, you've probably come across one of the most famous toy data sets, and
that's Fisher's iris data. Fisher was a geneticist 100 years ago or so, and he took measurements of 150 irises of three different varieties. He measured four different parts of each iris flower, which are the numbers on the chart there, and then there's also the name of the variety of iris, which are pictured on the right. So it's a classic example for doing machine learning, or even if you took statistics classes before machine learning became all the rage, you can do this with quote statistical methods, where the distinctions are pretty blurry also. But the question becomes, can you train a mathematical model to receive these measurements as an
input and output the type of iris as the prediction? And the data on the left, this is four dimensional data, so the charts on the left are an attempt to visualize the four dimensions in two dimensions by just looking at pairs of dimensions together. And so you can see in each of the dimensions, the data is somewhat separated; the colors are somewhat separated from one another. And so you could, even just with a pencil, go and draw a line that says, well, I think this line roughly divides this data. And then machine learning is essentially teaching a computer to draw some sort of optimal line in the higher dimensional space. So this is a really common example. It's very popular; people say,
oh, here's how machine learning works, and you can go pip install scikit-learn with Python, and you can train a whole bunch of different types of machine learning models on this, measure accuracy, and say, okay, great, here's a machine learning model, I can make predictions and do all these things. But the reality is, even though conceptually there's a lot of overlap between this problem and cybersecurity problems, where you are really just taking whatever data you're dealing with, whether it be your web logs or your binary files or something, and thinking of it as a mathematical representation, a vector space representation essentially, so you're transforming your data into numbers like these and asking, can I learn patterns in those numbers?
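For reference, the toy exercise being described really is about this short. This is a generic scikit-learn sketch (assuming scikit-learn is installed), not anything specific to a security product:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# 150 rows, 4 measurements each, 3 iris varieties as labels
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```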
So there is that similarity, and yet the realities of a quote toy problem like this and using machine learning effectively in a real world production environment are just worlds apart in terms of what it actually looks like. My desire here is not to be a gatekeeper for machine learning and try to scare people off, or discourage you from going out and taking an introductory Coursera class and looking at problems like these. Really, if anything, my desire is to encourage you to be a skeptic as you hear companies' marketing spiels about their machine learning and artificial intelligence, to hopefully ask better questions, and to be able to evaluate the claims that they're making. And to know that
Cylance has solved all these problems and our product is perfect, if anybody's listening. There are several aspects of this problem that distinguish it from many real world problems, and more specifically from cybersecurity problems. One is that we're dealing with a static data set here. If you're doing machine learning models on this, you have 150 data points. You could go collect more data if you wanted, but if you're simply operating on the toy example, it's just easier to deal with data that you know and that you know isn't growing. Similarly, it's a very low dimensional problem. We have our four dimensions of data, and it's just easier for us as humans to conceptualize; it's easier for models to find patterns in. It turns out there are
a lot of things about low dimensional data where the analogies don't extend into higher dimensions, and that's a whole other talk in itself, but we can conceive things about two, three, four dimensions that, if we think about hundreds of dimensions, the things we might assume don't apply. The data's mostly separable, and by separable we mean, as I mentioned before, if you look at those graphs, you could draw a line between the colors and do that pretty effectively. One of the challenges of doing machine learning on more complex problems is defining your features, which is your numerical representation of your data, in a way that makes it separable. The data is pretty interpretable. I mean, these
are measurements straight from the flowers. Interpretable data makes it easier to understand how your models are operating and when they're failing. If you wanted to get more labels for your data, say you only have 150 data points here and you needed a thousand, you could pretty easily, if you can find enough irises, go out and collect more measurements. And similarly, I actually don't know how Fisher's measurements were made, but if you understood how they worked, you could train someone else to go make those same measurements pretty effortlessly. That turns out to certainly not be the case in cybersecurity. And finally, this problem is
somewhat stationary, which is a technical term in the statistics world, but when we think about stationary here, it's more like how we use it colloquially: the data isn't changing. If we added more flowers, or we had to do this in real time, say we're an iris processing factory and we had a machine that could take these measurements and we wanted to know what types of irises they were, it's reasonable to expect that the measurements are not going to be coming from a different distribution of data over time. Maybe the flowers regionally vary or even seasonally vary, but we would hopefully expect that an iris now and an iris a few years from now of the same variety is going to
have measurements that are from the same type of distribution. So, all that to say, cybersecurity is not a bed of irises. The iris problem as a toy example is helpful in understanding some aspects of what machine learning is, and yet it really covers up... well, that's not fair really; it just is not as complex as the reality of machine learning on real world problems. I say "covers up" is unfair because Fisher wasn't making any claims that his data was applicable for those trying to apply machine learning to cybersecurity 100 years ago. You might have heard about deep learning, one of the popular things today. You might come along and say, well, yeah, this was true before 2009, but now we have convolutional
neural networks, or now we have recurrent neural networks, so it's actually easy now. That's just patently false. Deep learning is really extraordinary at solving some problems very well, and yet most of the caveats I'm gonna give here still completely apply. So don't let people trick you into that. And for the most part, deep learning is not categorically different from the things that people were doing prior to it. It does have some advancements, but it's not just completely different. So, nine challenges in applying machine learning to cybersecurity, in the next 25 minutes. One is defining the problem. We could think about all kinds of definitions of cybersecurity. Here's just an arbitrary one: cybersecurity entails the
safeguarding of computer networks and information they contain from penetration and from malicious damage or disruption. In Dan's talk this morning, he was saying one of the challenges that you need to accomplish for effective threat intelligence is to really understand what problem it is that you're trying to solve. And in my relatively limited experience with cybersecurity coming into this world in the last few years, you realize that it's easy to say terms like malicious or damage or disruption and yet to concretely define what that means in a way that you could teach a computer to recognize those things can be significantly more challenging. It's not necessarily black or white how you define those things. Charles Kettering, who is
head of research at General Motors in the very early days and is famous for a number of things in that space, has a quote: "A problem well stated is half solved." And that could not be more true in applying machine learning to your cybersecurity problems. Just getting to where we know what problem it is we're trying to solve is at least half of the work in actually solving the problem. So in the malicious web bot space, there are two immediate questions you can ask. You have traffic coming to your website and you want to block malicious bots; the first question you have to ask is, what is a bot? Does it strictly mean a computer operating independent of
a human operator? All kinds of difficulties there. And if you define it, can you detect it is another question. And then secondly, what is malicious? What's a bot with malicious intent versus a bot without malicious intent? In many ways, that depends on who the end user is as to how you're going to define that. Also, you could think of a sense in something might have both malicious purpose or intent and not I mean Yeah, you can get into ethical questions there and they're just all kinds of questions that have to be answered before you're deciding how to do that in the malware space, you know, what is malicious software a Temptation for a company blocking malware is to prevent
a any system from running auto IT because you can do all these dangerous things with auto IT and yet many people use it as an important part of their toolkit in running their companies. It's an example of software that has both good and bad uses. And can I necessarily know, I want to try to come up with a definition of malicious software, just because I define it, can I actually know it when I see the software? How do I know that I can know that? And then similarly, does the definition of malicious stay constant over time? I mean, you could, somewhat simple example is you could think of a piece of malware from 20 years ago or
something that that can no longer run on a system or maybe it was malware that operated on a system That we know has been patched to where the malware can no longer Execute the malicious functions if it can't be if it can't do any damage Do we still consider it malicious and you could think that can get more complex? examples of that but All that to say is, nailing down what you actually mean by the problem is very challenging. And going to a data science team, potentially speaking from experience here, going to a data science team who aren't domain expertise, aren't domain experts in your particular realm of cybersecurity and telling them, oh, go block this malicious thing, go prevent this malicious thing, and
then just kind of hemming and hawing when they're asking you what you mean by malicious, what you mean by damaging, that just doesn't work. There's no magic thing in data science that knows what malicious means. Similar to that, supervised learning is effectively pattern matching. I presented this talk to my colleagues this week and my one colleague's complaint was that there were no dog gifs, so this is the best I could do that was somewhat but anyway, doing machine learning requires a sufficiently large amount of accurately coded examples. So you have to, you're training a computer to recognize patterns, it has to have some kind of ground truth to recognize those patterns from. And in most cases, it needs ground truth from both what
we call positive and negative examples. If you're talking about malicious versus not, you need a good number of examples of malicious things and a good number of examples of benign things. And this is one of the biggest challenges for anyone applying machine learning. I saw a relevant tweet last night, quote: "From every CEO ever: I thought AI meant we didn't have to label data anymore." And that's just not the case. Having good labeled data is still essential for doing machine learning. And this is one of the areas where, in the space of image recognition, deep learning has really made changes, because it has essentially reduced the need: it's allowed models to learn more things from unlabeled data and then learn faster once you have labeled data, but that's deceived people into thinking that we just don't need labeled data anymore. So in the case of Cylance, or anyone doing machine learning in the anti-malware space, we have people who are executing files in sandboxes, who are decompiling files, doing all this research into whether or not things are malicious. We're collecting examples of non-malicious software to store as examples, because we need to be able to learn the patterns both in malware and in non-malicious software. This is a huge challenge, it's very expensive, and it is absolutely essential. Going back to the iris example, you could pretty easily train someone to label iris data by taking the measurements. It just requires a more advanced skill set to train someone to distinguish between examples in the security space, and a more advanced skill set means more expensive. This is definitely one of the hardest problems in all of my examples. Challenge three: what's the cost of predicting incorrectly? Machine learning models are never going to be completely right all the time. If you want your model to never incorrectly say that something is malicious, it's just never going to be able to say anything at all. If you're making predictions, it's going to be wrong at times. So,
from the business perspective, you need to be able to quantify, or at least attempt to quantify, what the implications are of a machine learning model being wrong, particularly if it's making automated decisions. So again, using examples from spaces I've worked in: in the malicious web bot space, your machine learning model's wrong, and suddenly on Black Friday it starts blocking people from accessing your website who are trying to buy items you have on sale, and you're immediately losing revenue because you're blocking people from your website. Similarly, in the bot blocking space, if you block humans incorrectly, they're gonna get mad at you and they might not come back again. Or you might fail to block an advanced botnet and it DDoSes your site. That can be very costly. And I think in those cases the loss of revenue is easier to quantify; sometimes the cost of being attacked might be harder to quantify. In the malware space, obviously, if we fail to block malware and our customers get ransomware on their computers, they blame us for not blocking it. Very costly to them, very bad for our reputation. If we block things incorrectly, and this is actually an important nuance, we might annoy people a little bit, but in the malware space you can then go in and whitelist something, so it doesn't have as detrimental an effect as maybe blocking someone from buying something on Black Friday. And the nuance there is that there can be a huge imbalance. For a malware provider, there's a big difference between blocking something incorrectly that's not malicious and failing to block something malicious. Failing to block something malicious could have these huge detrimental costs. Blocking something that's not malicious is mostly just annoying, unless you do it too much and it really damages your reputation. And why that nuance is important is that as you're training machine learning models, as you're fitting your model to your data, most machine learning models, if you just use the default settings, assume that those two costs are exactly equal: that a false positive and a false negative, as the terms go, are equal in cost. But the reality is, in most spaces, that's probably not true. And then, beyond that, the challenge is how you actually quantify those costs or measure them, to evaluate your models and decide whether or not they can be used in production. I'll try to refrain from saying at each point that this is one of the hardest problems, because these are all real challenging problems. Number four: model predictions decay over time. The technical term is model drift. Machine learning models being used in largely human systems tend to get worse over time unless you're continually updating them.
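As a rough illustration of what "continually updating" implies operationally, here is a minimal drift-monitoring sketch, assuming a scikit-learn-style model and freshly labeled weekly batches; the function name, data shapes, and threshold are invented for illustration and are not any vendor's actual pipeline.

```python
# Hypothetical sketch of watching for model drift: score each week's freshly
# labeled samples and flag when the detection rate sags. Names, data shapes,
# and the threshold are all invented; this is not any product's real pipeline.
from sklearn.metrics import recall_score

def watch_for_drift(model, weekly_batches, alert_threshold=0.90):
    """weekly_batches: iterable of (features, true_labels) collected after deployment."""
    history = []
    for week, (X, y_true) in enumerate(weekly_batches):
        detection_rate = recall_score(y_true, model.predict(X))  # share of malicious samples still caught
        history.append(detection_rate)
        if detection_rate < alert_threshold:
            print(f"week {week}: detection rate {detection_rate:.2f} -- time to retrain")
    return history
```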
So one of the advantages of Cylance as a product, or potentially other machine learning driven antivirus and anti-malware tools, is that we can block malware in the future with models that we train now, because we're not looking just at databases of hashes, we're looking at patterns inside malware. So we can stop things in the future without having seen them before. That's one of our huge selling points. And yet, if we just stopped now and said, well, we can predict things in the future, we can block malware that has not even been written yet, and rested on our laurels and drank a lot of beer, our models are just gonna get worse and worse. There could be all kinds of reasons for that you could think of. One being that the malware authors are gonna learn how to work around our tools. Or simply that the operating systems are changing and updating, so maybe we're looking at patterns that just aren't applicable in the future. There could be other reasons there as well, but if we just stop and use these trained models, they're gonna get worse and worse. So once you start doing this, you have to stay on top of it, and this is not just in the cybersecurity space but with any type of machine learning model. Dynamically updated models are rarely used. If someone tells you that their machine learning model is learning all the time as new data comes in, they're probably actually lying to you. That's what every marketing team wants to say, and there are cases in which you can do that, but the challenges of doing it are typically insurmountable for a number of reasons. Sometimes just the structure of the machine learning model isn't conducive to what's called online learning, which is where it learns from data as it comes in. And there's way too much risk in the models being gamed if you're doing that: you have a model that's recognizing malware and someone throws tons of examples at it that force it to go off course. And just a number of other reasons, but this is a really, really, really hard thing to do. Everyone wants you to think that they're doing this, but it's virtually impossible in many applications. The exception being that sometimes you can make some local adjustments. Think of Face ID on an iPhone or something. They have a machine learning model that's making these local adjustments to adapt to your face. That is actually a real-time example, but it's making changes to small parts of the model, where there are underlying things that have to be static and then updated in a more controlled environment.
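To pin down the vocabulary, here is a minimal sketch of what "online learning" versus periodic batch retraining looks like in scikit-learn; the features and labels are random stand-ins, and none of the products mentioned in the talk necessarily work this way.

```python
# Minimal illustration of online learning (partial_fit) versus periodic batch
# retraining. The data is random stand-in noise; this only makes the terminology
# concrete and does not describe any real anti-malware model.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Online learning: the model is nudged a little by each new batch as it arrives.
online_model = SGDClassifier()
for _ in range(10):
    X_batch = rng.normal(size=(32, 5))         # stand-in feature vectors
    y_batch = rng.integers(0, 2, size=32)      # stand-in benign/malicious labels
    online_model.partial_fit(X_batch, y_batch, classes=[0, 1])

# Periodic retraining: fit a fresh model on a curated, labeled snapshot instead,
# in a controlled environment, and ship it as a new release.
X_snapshot = rng.normal(size=(1000, 5))
y_snapshot = rng.integers(0, 2, size=1000)
retrained_model = SGDClassifier().fit(X_snapshot, y_snapshot)
```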
Adversarial actors have an incentive to bypass your models. Someone made this video when I was working in the bot mitigation space: it's a robot arm clicking the "I am not a robot" box. And I mean, it's silly, but there's so much reality behind this as to what is going on. The computer has cookies that Google thinks are good, so it's not quite as simple as this. But the reality is that if you are trying to stop someone from doing something that they have an incentive to do, they're gonna try to find ways around your model. And this is something that we spend a lot of time on in the malware space: making
our models so that you can't just take some malware, add a bunch of junk code that doesn't do anything or maybe doesn't even execute, and all of a sudden your PE file looks different and the model fails to classify it as malicious, or things like that. We spend a lot of time hardening our models to prevent this. Malicious actors have an incentive, for whatever reason, financial or otherwise, to get around your security products, and they're going to continue to try to find ways around them. And there's a subtler thing here, which is that repeated access to model predictions allows adaptive behavior. If you have your machine learning model accessible in some sort of oracle form, which unfortunately an antivirus or anti-malware tool is, then either in simplistic ways or in more advanced mathematical ways, people can model your model. They can intelligently decide how they throw data at your model to see what it predicts, to then try to find ways to work around it. This is basically unavoidable in the anti-malware space, but in other spaces you might be able to limit the access that someone has to your model. You're, in some sense, giving information away as you make your predictions visible. Challenge six,
feedback loops. Blocking malicious actors changes the nature of your data. The bot mitigation problem is a really good example of this. You have a bunch of weblog data that you manually, somehow magically, classify as malicious or not, and then you want to train a model to recognize future logs as malicious actors or not. So you put your model into production, start running it, and then you start blocking all this malicious traffic, which changes your log data: you have less malicious traffic because you're blocking it. And a reason for that is that in web log data, a malicious actor doesn't just have one log line, they have hundreds or thousands or millions of log lines. And if you're able to effectively stop them, they might just have one or two. You see them a couple of times, you block them from your network or from making their HTTP requests, and they're gone from your data. Then your model drifts, as we talked about. Your model decays over time in quality, and so you need to go retrain it. And you don't want to retrain it on old data, you want to train it on new data, but your new data is biased against the malicious actors, because you're removing them from your data. So you're then training a model on data that's biased away from the naturally occurring data you would have apart from your mitigation strategies. This creates a feedback loop that, if you don't take it into account, you retrain your model and suddenly you get worse predictions and you have to explain why. There's a really amazing paper from Google researchers called "Machine Learning: The High-Interest Credit Card of Technical Debt." Great title, and it's just a great paper. If you're interested in a production machine learning system, or if you're doing it already and you haven't read this paper, you really should go read it. It's very approachable, not highly technical, but they talk about this feedback loop problem and a number of the other challenges of running a production machine learning system that may scare you off from really doing it. Challenge seven: data access. I'm gonna speed up a little bit so I have some time for questions. By the
nature of security problems, we're dealing with data that's restricted. Are you, as the researcher, gonna be able to get access to your data to learn patterns from it, so that you can make predictions about the future? Similarly, can you get access to both malicious and benign class data? In some cases, someone's happy to give you their examples of malicious data, but they can't share any benign data with you because it's their private business files or whatever. Somewhat unexcitingly, GDPR is changing this also, in that if you have customers in Europe, they might be legally prohibited from allowing you to take their data into the US to train a machine learning model on it and then send the machine learning model back to Europe. We're hiring
malware analysts in Ireland specifically to look at data that we're not able to bring to the US. It's just the reality of the way it is, but it's actually changing how people are going to do business. Are you going to be able to get access to the data that you need?
Challenge eight: model interpretability. Many classes of machine learning models are inherently difficult to explain, and by that we mean the predictions that they make. Someone comes and asks you, well, why did it make that prediction on this piece of data? Even defining what explainability means is a challenging thing on its own, but assuming you have some definition, some machine learning models do such convoluted things with your data that it's very difficult to give any explanation other than literally showing what the model did, which is like, oh, well, we multiplied these things by these decimals and did these mathematical operations and then we got a result, which is meaningless to anyone who
really is trying to understand why that happened. But if you provide some sort of security product that your CEO dogfoods, something's going to happen where your CEO sees something get blocked, or not get blocked, that he or she doesn't expect, and they're going to come to your data science team on Monday morning and ask why this thing happened, and you get to be Mary Poppins the neural network and just walk away saying, I can't explain this. But analyst teams, your customer success teams, your customers, potentially regulators in some spaces, are gonna want to know answers to these questions, and if you haven't put forethought into this, it's
gonna be very hard for you to provide those answers.
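One generic way teams approach that question is a global "surrogate" model: fit a small, readable decision tree to mimic the black-box model's predictions and hand the tree to whoever is asking. Below is a minimal sketch, assuming scikit-learn and synthetic data; it illustrates the idea only and is not what any vendor mentioned here actually does.

```python
# Sketch of a global surrogate explanation: train a small decision tree to mimic
# a black-box model's predictions, then print the tree. Synthetic data; purely
# illustrative, not any vendor's explainability mechanism.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=6, random_state=0)
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Fit the surrogate to the black box's *outputs*, not the true labels,
# so the tree approximates the model's behavior rather than the data.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

print(export_text(surrogate, feature_names=[f"feature_{i}" for i in range(6)]))
```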
Yeah, this is a huge question in machine learning in general that people are wrestling with, but it comes up a lot in my work. There's more to be said there, but I'll save that for another talk too. Finally, and this is my worst title, challenge nine: non-Pareto model transitions. This is probably also one of the harder things to conceptually understand if you haven't worked with machine learning. Machine learning models, as you update them, which we call retraining, can introduce regressions in the classifications, and that's regressions not in the statistical sense but more in the software testing sense: a model that once made a correct prediction on some example might, if you update it, suddenly start making incorrect predictions. And even worse, for some types of machine learning models, because most machine learning these days uses probabilistic methods, depending on statistical algorithms to find the parameters during training, you could train the exact same model on the exact same data and, in some cases, even if it's a very good model, get different predictions just because of the nature of the probabilities. So this can introduce security holes. We deal with this in the malware space. Your model has always blocked some known piece of malware, and then you retrain it and, for whatever reason, and kind of going back to explainability, it might be hard to explain, it suddenly stops blocking that malware, because we're dealing with probabilities. Or maybe you release your new model and all of a sudden it's blocking Microsoft Word on all your customers' computers. It's not like it's universally worse, it's not like it's just getting everything wrong. The reason for choosing non-Pareto as the term is just that it's probably not gonna be better in every possible instance. It's gonna be better in a lot of instances, but in some instances you're gonna have these regressions where your model suddenly starts being incorrect. There are ways that you can then deal with this, but however you do it, the reality is that you have to deal with it. You can't just let it happen, though in some applications you can probably get away with that.
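One simple way of dealing with it is a release gate: before promoting a retrained model, replay a pinned suite of known-malicious and known-benign samples and refuse to ship if anything regresses. A hypothetical sketch follows; the function and data are invented, not the speaker's actual process.

```python
# Hypothetical release gate for a retrained model: replay a pinned suite of
# known-malicious and known-benign samples and refuse to promote the candidate
# if it regresses on anything the current model gets right. Invented names/data.
def promote_if_no_regressions(current_model, candidate_model, pinned_suite):
    """pinned_suite: list of (feature_vector, expected_label) pairs, e.g. known
    malware that must stay blocked and known-good software (the Microsoft Word
    case) that must stay unblocked."""
    regressions = []
    for features, expected in pinned_suite:
        current_pred = current_model.predict([features])[0]
        candidate_pred = candidate_model.predict([features])[0]
        if current_pred == expected and candidate_pred != expected:
            regressions.append((features, expected, candidate_pred))
    if regressions:
        raise RuntimeError(f"{len(regressions)} regressions -- do not ship this model")
    return candidate_model
```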
So, in conclusion, nine ways that machine learning is challenging to apply to cybersecurity problems. One, just defining the problem. Two, getting labels on your data, having your data labeled as malicious or not, or whatever your taxonomy is. Three, what's the cost of being wrong? Can you even quantify it? And if you can, how expensive is it going to be to quantify, and can you then fit it into your models? Four, models decay over time. In the real world, you have to deal with models getting worse as the world around them changes. Five, adversarial actors have an incentive to work around your models. Six, biased feedback loops are gonna change your retraining,
your updating of your models, unless you control for the feedback that you're getting, recognizing that as you act on your data, you're changing the nature of the data. Seven, can you even get access to the data that you need? Can you get it out of Europe? Will your customers allow you to have it? Do you have to work on their system? Can you put it in AWS? All these questions. Eight, model interpretability: once you get a machine learning model and it starts making predictions, are you gonna be able to tell why those predictions were made and what they mean? And nine, model instability: your models, as they transition, might suddenly make bad predictions that are gonna cause
you trouble. I'm happy to take questions. I don't know where we are on time. - We got maybe one or two minutes. - All right. - In terms of the decay and instability... - Yeah, so the question is, can you use machine learning to detect interesting things and then maybe somewhat manually put those into an expert system or something like that. Yeah, there's a lot of possibility for that, depending on the application. You could certainly do that with a manual review: you train your machine learning model and then it detects things that you manually evaluate. That's completely application dependent. But there's certainly a lot of opportunity, in many problem spaces, for that kind of human and machine
learning model interaction. Yeah, I mean, it's basically like Google, right? Google is a machine learning product that's doing the search and then ordering the search results, so it's putting the things at the top that it thinks you might find interesting. There are a lot of analogies to using analysts and machine learning products together. - All right, we're gonna wrap it up there. I'll be around for a little while, so if people have more questions, I'm happy to talk. Thank you again. A coin and a local product here, a local beer. And if you still haven't gotten your t-shirt and you have the survey, go get your t-shirt. And then after that, we're going to start
giving out t-shirts for the people that did not do the survey. We're also going to be doing a silent auction today, and in a bit we're going to start collecting for that. The top prize in the silent auction today is from Chris Nickerson. Lares Consulting is going to donate a free pen test. To anyone that has passed the PCI audit... Up next, categorical correlations as probabilistic rules with Nathan Donovan with Data Machines Corp at 11 a.m.
- That was nerve-wracking. Hi, I'm Nathan Danneman. I'm gonna talk a little bit about this thing that I'm calling Rule Breaker. Really, it's just exploring categorical correlations as probabilistic rules. That'll make a lot more sense to you in a minute; if that makes you yawn, just kind of hold tight for 30 seconds. Okay, a bunch of huge thanks. The B-Sides people, staff, especially Daniel who helped me figure out what on earth I was going to be doing here. You guys are awesome, you lined up great speakers. Big thanks to the earlier ones. I'm going to stand on the shoulders of those giants a ton, especially Tim, who made nine great
points about why machine learning for InfoSec is really, really hard. I'll be jumping off from that a lot. I'll also jump off a lot from the first talk about rules and inducing rules from indicators of compromise, and kind of marry those things a bit. I didn't realize that this thing was gonna be ordered this way, but if this was the intention, great job. Big thanks to a company called Data Machines. I work for them. They're kind of a small outfit. They do statistical consulting; they also stand up fancy private clouds for customers who need that kind of stuff. We're not a product company, so I'll be talking about this thing that
I've given a name to, but you can't buy it from me, but I can give it away to you for free, in part because Uncle Sam helped fund part of it. Big thanks to the government doing lots of kind of high-end R&D in the cybersecurity space. It's mandatory that I have that gray stuff on the slide. There it is. Okay, so we're gonna talk a little bit about big data and analytics for cyber defenders. And that's gonna really riff on some of the things you've heard this morning already. I'll talk about this thing Rule Breaker, which I like to say with like, I don't know, some kind of bad Scandinavian accent, right? Like, rule breaker. We'll talk about how it works and what it is. I'll talk
about how to try and make it security relevant, and its use cases. I'll talk about next steps a little bit, because this is active research; this isn't a beautiful shrink-wrapped thing that I'm extraordinarily proud of. Again, it's not a product. You can't buy it from me. Data Machines doesn't sell anything except people's research time. It's nice, I get to make fun of people who sell things, which was me as of a couple years ago. Really brief background: I'm kind of an applied statistician, so perhaps much like Tim, who I think is no longer here and can't defend himself, which is perfect. I know enough, I guess, about mathy stuff, and I've picked up a lot of networking stuff on the fly over the last four or five years,
but networking, computers, cybersecurity, it's a really, really big space, so I broadly claim a lot of ignorance there. If I say something silly, just give me the benefit of the doubt or come tell me later; we can dream up new use cases. At the end of this, what I'd like to do, what I will do kind of in brief, is show you how to code this thing for yourself. It's easy. Even if you're not much of a coder, it's still pretty easy. So that's kind of the goody bag at the end of this. Hopefully that'll keep you engaged for no more than 30 minutes. Okay. I'm gonna set up this kind of topology of
ways we could do cyber defense that are data driven and then I'm gonna break it. Just kind of for kicks. 'Cause breaking things is fun. So one way you can do defensive cyber analysis is with rules. So the idea here is let's encode knowledge in a bunch of kind of if/then statements. These are often like known bad actor TTPs, which I'm told stands for Techniques Something and Practices. You can also get these from IOC lists as the first presenter talked about at 8 o'clock this morning if you caught that. So an example kind of tooling here is something like Snort and you get rules, hard and fast ones that look like if some IP and
some port or some other domain, then block it, report it, alert on it, stick it in a queue and never look at it, whatever you're gonna do. So these have some real pros, right? These things are understandable when some jerk face like me hands some real cybersecurity expert a possible issue and they say, "Well, what makes it an issue?" You can just say, "Look, if bad guy domain, then bad guy stuff." If it's a bad guy domain, it's really easy to understand. They're targeted, they're fast, both to implement, though maybe if Dave was here he'd be telling me I'm making too small a thing of that. They're also super fast to execute, right? You can
do these often at line speed. Big cons, though: these things are pretty simplistic, so they sometimes leave some things to be desired. Bad guys can sidestep them pretty easily by changing small characteristics. Another big con here is the known-knowns thing. If you have a bunch of rules, you can catch known knowns pretty reasonably, but you're kind of helpless still with respect to things you don't know about, and that's either because IOC lists come in kind of slowly, or because the bad actors are just slightly more clever and they're changing things on you. We could talk about catching zero days, but that'd probably be a lie,
but we'll talk more about that in a minute. Okay, defensive analytics two. So you wanna like, not necessarily level up, but take like a different approach. Maybe we do statistical pattern matching, which was the whole thrust of Tim's talk, right? The idea here is you get a bunch of examples of, for instance, malware carrying PDFs and a bunch of benign PDFs. You learn some statistical model that differentiates those two. You put that model into production and it says, that's a good one, that's a bad one. That looks like the good ones I've seen, that's a bad one. Just one of many examples of people who do things like this, FireEye, Cylance, there's a slew of
these guys on the market right now. Statistical pattern matching, aka supervised learning, is sweet in that it is targeted, so you're gonna have these really specific models to try and separate bad guy stuff, MS Word attachments, from benign ones, and they can be quite accurate. In spite of all the very real problems Tim alluded to, you can get pretty high accuracy out of these things. Of course, if you're getting it right 95% of the time, you're still getting it wrong a lot, because people are slinging a lot of files. Cons, though: as Tim really nicely alluded to, you need training data that's labeled. So you need examples of, I don't know, regular people's MS Word
files, and a bunch of bad-guy-containing MS Word files. You also get, as he also very aptly pointed out, these crazy uninterpretable models. So I send results to the cyber SME that I sit next to, and he or she goes, "What is this? Why is this bad? Why are you sending it to me?" And we're gonna have to have a long, awkward conversation where I basically acknowledge that I don't know how to interpret my own models. These things also have a really short shelf life. I think, man, that Tim guy really crushed it. As the data changes, as the threat space changes even slightly, your models get stale really quickly,
just like your rules, which are also subject to the same problem. And you need many models to achieve broad coverage. This is a totally childish version of the picture, but you need, like, the Microsoft Word 2003 separating model and the one for Word 2007 and so on, so you need tons of these models, and you have to keep all of them up to date and deal with all the issues of them becoming stale, and interpretability, times every model you have in your whole system. Yikes. Okay, so there's another way you can use data to try and defend networks. I guess I should have
mentioned I worked at a company, maybe or maybe not that company, for a long time doing this and other things, and recently I've been doing cyber R&D for the government for a bit. So that's kind of where I have the visibility to say some of this stuff, maybe authoritatively, maybe not. The idea here is: let's use a statistical model to describe normal data. If I can describe normalcy, then kind of ipso facto, anything that doesn't fit tidily in that bucket is anomalous. I know IronNet does it because I wrote models like this for them. Or you can also try and model the anomalousness of observations directly. Huge field on that.
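For concreteness, here is a minimal sketch of the "model the anomalousness directly" flavor, using scikit-learn's IsolationForest on invented flow-level features; it's just to make the idea tangible and is obviously not what IronNet or anyone else ships.

```python
# Minimal sketch of modeling anomalousness directly with an IsolationForest,
# trained on invented flow-level features (bytes out, bytes in, duration).
# Purely illustrative; not anyone's production anomaly detector.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
normal_flows = rng.normal(loc=[500.0, 2000.0, 1.0],
                          scale=[100.0, 400.0, 0.3],
                          size=(5000, 3))

detector = IsolationForest(contamination=0.01, random_state=1).fit(normal_flows)

new_flows = np.array([[480.0, 2100.0, 0.9],      # looks like everything else
                      [90000.0, 150.0, 30.0]])   # huge upload, long duration
print(detector.predict(new_flows))  # 1 = fits "normal", -1 = flagged anomalous
```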
If you care, I'll point you to other papers or talks. The pros here are that you get these really cool dynamic models that can update really easily over time. You get per-environment customization, so I don't try and apply some generic model to all these different enterprises that have really different stuff going on. And you can detect unknowns, which is super sexy. When you say that in a sales pitch or to the government, they're like, "Oh, unknown unknowns." That's, like, the thing. Problems, big problems: these things are super noisy. And that fact derives from the really simple notion that (weird feedback over there) everything that's anomalous is not necessarily
malicious. The internet, because of super casual design and usage patterns, and because we need our admins to have broad freedoms, is full of weird, highly anomalous stuff that's totally benign, all the time. If one in a thousand things is anomalous and you have big data, so there are a bajillion things going across your network, you do simple math and you realize there are gonna be a ton of aberrations, definitionally. Your definition of what an anomaly is implies an algorithm for detecting that type or class of anomaly, which implies what your result set will be. And, TL;DR, anomaly detection is really, really complicated. It's easy to mess this up in ways you won't know about. And you also get these terrible, uninterpretable
models. Okay, so that's the paradigm I've set up, just to kind of knock it down. I'm gonna talk about this Rule Breaker thing that gets you, hopefully, some bits and pieces of the power of all three of these approaches, and, to some extent (we can discuss this afterwards; again, it's nice that this is not a sales pitch), it sidesteps a lot of their pitfalls. So the high level idea behind this thing I'm gonna eventually get to the point and describe is that we're gonna generate simplified models of normalcy in the form of probabilistic rules. I'm gonna look at your data, and in your data, if port 53 then DNS, likelihood
99% of the time. We're gonna infer that from your data, and then we'll go back through your data and try and identify observations that break that rule. We'll talk a lot in a minute about how to identify security relevant rules as opposed to just correlations nobody cares about. You get dynamism, you get some unknown unknowns here from the anomaly detection part of this. You get the interpretability of a rule, so I can pass these results to the cyber SMEs really gracefully. And they say, why is this weird? And I say, look, because in the data, if A and B and C, then D, but here, if A and B and C, then F. And they
can use their mental model of networking to decide if that's bad or not. Then you get some of the accuracy from statistical pattern matching. These rules are probabilistic instead of deterministic, which means they hold most of the time. And again, the cyber SME can determine whether they're like, "I give a crap about a breakage of this rule," or "That rule is really silly." A little bit more motivation. In network data, because of RFCs, convergent use, and habits, you often get these strong interdependencies between fields or variables. So think, if you're looking at NetFlow data from YAF or Bro, or you're looking at, I don't know, host level data from insert-name-of-your-favorite-agent-here, you get these strong correlations: file extensions that imply or go
with MIME types, ports and protocols. The second bullet is kind of neat, because here we can start to talk about ranges of continuous variables, like return bytes. So if I'm doing port 53 and it's an AAAA record and you get back a bajillion bytes, that's kind of atypical. Port numbers correlate with protocols, and so the high level idea here is: what if we can automagically identify strong categorical correlations? If port 53, then probably DNS. One, to learn about a network, and this is great if you're on a consulting team, or like some of the folks I've worked with in the past on the government side that go to different networks every
month. If you work in the same SOC for the same corporate entity for, like, the last five years, you probably know your network inside and out already. Maybe that's not cool to you, but then you still get the second thing: with these probabilistic rules identified, again really simply, even if you're not a coder, you can then identify breakages of them to learn about anomalies. And then we'll talk a little bit about how to pick the rules and the breakages that are actually security relevant, to get you over the give-a-shit threshold. Okay: hey man, can't we just find people who are using DNS but not on port 53 with
a SQL query? That was the first thing my analyst said to me. It's a super good critique, right? Like, hang on, fancy math. I've got SQL, I've got a database. Get out of town. Short answer is yes, if. So if you already know what mismatches you might care about, so if you already know you wanna look for things not identified as DNS by your protocol inspector over port 53, then you just query that, right? You know, select star where, I don't know, port equals 53 and protocol not equals DNS. But if you don't know what to look for, again, unknown unknowns, SQL doesn't help you there. And then, yeah, if there's only a few mismatches
you care about. So if that list of mismatches you care about is like of order tens of them, sure, query them up if you think that's a good use of your time. Maybe script that querying to automate those things. That's nice, but what if there are an arbitrary number of unknown mismatches that you might care about? You can't SQL that. Okay, and isn't this just anomaly detection? Like I kind of like laid out point one, two, and three for defense with cyber data and I said this is kind of like overarching blah, blah, blah. But this is really just the third one. Sort of, it certainly starts out that way, right? We're certainly gonna model
normalcy, but we're gonna do it in a really particular way. This thing is optimized for categorical data, which just means that instead of a numerical thing like height in inches, it's a categorical thing like, color in the RGB... oh, that was a terrible example... like color in the Crayola 8-pack. And this thing is really performant in the face of irrelevant variables. You can put in lots of fields, and if there are fields or variables that don't correlate with anything else, this thing doesn't care. Lots of anomaly detectors don't have that characteristic. And this one, importantly, is focused on near misses, which makes it really potent for security purposes. A lot of
times, because of adversarial intention, your adversaries aren't fielding inputs that are tremendously anomalous. They look just right in a lot of ways but are off in, like, one feature or something, and this thing is optimized to detect that. Okay, let's talk a little bit about the language of rules, because what we're gonna do is induce them from data. So Rule Breaker thinks in terms of categorical correlations. Again, I'm gonna go back to this kind of trivial example, and we'll talk about all the complications that arise in a bit. But "if port 53 then protocol is DNS" is a strong categorical correlation, right? There's my toy example, and it kicks out a probability statement for that. This is just a simple conditional probability. Note that
these are descriptive rather than normative. This is not a thou-shalt rule; this is an "I saw in your data that this correlation holds" rule. So breakages of it are just breakages of correlations. Rules, and I'll talk about them using these words, have three parts. They have a predicate: if predicate. They have a consequent: then consequent. And then they have a conditional likelihood, that last little thing. Again, if you're thinking, oh God, how do you code that? You're gonna get all that for nothing if you use Python, R, or Spark. We'll talk in a minute about other things you can layer on top of that, which you might wanna code yourself. Okay, so at a high level, here's the kind of
game plan. Association rule mining is kind of this big area of data analytics that goes back like 30 years, and it's the task of identifying antecedent sets highly likely to imply a consequent. So find these correlations. That's the name of the high-volume thing if you wanted to Google it. Like what's association rule mining? And all the packages in those programming languages I talk about reference it as such. So it originated with market basket analysis. So you're running a supermarket and you want to find strongly correlated items from sets of things people buy. So you can co-locate those items in the store. Because it's really annoying when the pickles are with the sandwich stuff and the
olives are elsewhere. But in your mind, you organize that store according to things that go in cocktails. And those things should be together. So for example, bread and peanut butter usually implies jelly. There's a citation if you want to go look at an early one. So the task involves identifying frequent item sets. So things commonly bought together for our use cases, right? Cyber things that tend to correlate. And then you use those to identify rules. Two-step process. Again, you're going to get both of these for free, so I'm going to be brief about this. Okay, frequent item set. A frequent item set is just a set that occurs frequently. I love it when things have
good names. You specify some level of support S, which is the frequency in some big data set that a set must meet before you are gonna call it frequent, right? These folks, Han et al., back in 2000 came up with this excruciatingly clever way to find these things that's parallelizable and wicked fast. If you're into algorithm development and you want to see just a pristine example of data structures as algorithms, go look up this paper. I read it like four times. I'm still not sure I understand the ins and outs of it. It's just lovely. But basically you scan a database once to find a sorted list of frequent items, singletons, and then you grow
a tree based on relative conditional frequency, and you use that tree structure to rapidly pull these things out. Again, if you're into that, great. If you're not, you don't have to know this or code it yourself; you're gonna get it for free in some package. Okay, so suppose you have all these frequent item sets: peanut butter, jelly, bread, they go together all the time. A rule is a unary split of one of those sets where the likelihood of the consequent given the predicate has some minimum confidence. Oh, God, these words. So an item set might be A, B, and C. A unary split is any split that leaves one of those things
on the end, and so a rule might be something like: if A and C, then B, with probability 96%. So that's what the algorithm is doing under the hood: find frequent item sets, check all the unary splits of all the frequent enough sets, and see if they land with a high enough probability that you care. All right, so rule induction is that thing. Okay, no one is here to learn about association rule mining; we're all here to learn about cybersecurity stuff. We get two things from association rule mining. First, we learn about patterns in the network: this client uses DNS, and because they think they're clever they pointed it over some odd port; particular
services might be strongly associated with particular usernames; et cetera. We can get all that kind of stuff for free, which is great if you're going into a new network kind of blind, which has been my experience more times than I care to say. And then, again, we want to do this kind of anomaly detection task to find all the bad guys, right? We're gonna identify discrepancies. So maybe there's this "if port 53 then protocol is DNS" some high percentage of the time. That's the rule; the second thing is a breakage. You sweep back through the data identifying sets that have the predicate and not the consequent. I know how
IPs work; I shortened all the IPs just to make sure I don't accidentally point to a real IP in my toy examples. So, like, 123 sent ICMP traffic to 456 over port 53. That'd be a breakage of that rule, because the rule is "if 53 then DNS," but here we have 53 and ICMP. We're gonna find things like that, and you get all the rules and all the breakages kind of for free, and that's nice. Okay. But these are just correlations, and correlations, especially in really big data, are everywhere; most of them are meaningless. How do we generate ones that are security relevant? If you're thinking that, you're asking
exactly the right question. So a couple ways, choose the right features, make sure the rules are statistically sound, and then automate the dropping, right, the trashing of uninteresting rules and breakages. Okay, so some security relevant rules and breakages. One thing you want to do here is choose security relevant features or sets of features where breakages of correlations they're in are going to be of interest to you as an analyst, right? This is you using your noggin. Math doesn't get you this. This is where your kind of creativity and expertise needs to come in. So for example, an example that I run everywhere because it's nice and intuitable is to check for strong correlations in the inferred file types. If you're using NetFlow parser like Bro or something,
it'll say, hey man, here's this file. I think it was a text file and the file extension was .txt. We think those things should match up. We know they're security relevant because bad guys like to stuff, I don't know, an executable into a PDF. Again, if you knew that one use case, you could just find that with SQL, a SQL hook to whatever your database is, but if you didn't know all the ones that might be security relevant, you can use this tool to have them served up to you for free. Some other kind of examples I use, please don't furiously jot these down, we'll make sure these slides are public. Okay, the other thing
you can do is check your metrics. All of these packages give you this thing called confidence for free, which is the support of A and B, so the count, the number of times peanut butter and jelly are bought together, divided by the support, the count, of A. This is directly convertible to a conditional probability: the probability of B given A. This is nice. This one, lift, is better. The only difference is that we also divide by the support of B, so lift is confidence(A implies B) divided by support(B), and we penalize really, really common Bs. What does that mean? Here's a garbage Venn diagram from five years ago when data science was new. Suppose we find this
rule: if machine learning, then data science. The confidence is going to be really high, because all of machine learning, hypothetically according to this author, is data science. Confidence equals 100%. But what you'd find with the lift is that everything on here is data science, so some rule that says "if something, then data science" is not all that interesting. Lift goes in and penalizes those really common Bs. So lift is a better metric. You get lift for free in R and Python; in Spark you have to write your own, and it's not hard. Okay, and then the last thing you're gonna wanna do, even if you've taken these steps to pick good features and use the right metric for figuring out which rules are
cool and which ones you're not gonna care about, you're still gonna get garbage rules, and I don't mean incorrect ones, I just mean boring ones. Like: if the MIME type is text/html, then the extension is .html with high probability. You find some breakage, uh oh, someone's got a MIME type of text/html and extension .txt. It's the bad guys! It's not the bad guys, right? You don't care. So the team I'm working with now, we implemented an exceptions list, which is a way to whitelist rules, parts of rules, or breakages that we know we don't care about. So when we run the same use case a second time, either on the same network
or a different network, or the next day or whatever, we don't have the results we don't care about served up to us. We could say, hey, anytime the rule has .html and the breakage has .txt, don't show me that. It's a really simple way of pruning these things. We wrote this awful little mini language that gets parsed into that thing. I'm way more of a statistician than a coder. Sorry, I'm not sorry. Okay, here's a quick example. Again, because a lot of this was done under the auspices of the government and on people's data that they won't let me talk about in detail, I'm gonna be just a tiny bit vague here. So sorry, the
part where I didn't show you all the thousands of baddy bad bad guys I caught with this thing, I can't point a lot of fingers, but just take this as a nice exemplar. We think MIME type should be tightly coupled with file extension or vice versa. The rules, they have an arrow, but there's no sense of causality or directionality under them. And we know breakages of this might be interesting because of anecdotal evidence, that thing I just mentioned. It says PDF, but look what's under there, it's an executable. Uh oh, so we have anecdotal evidence that this kind of breakage might be interesting. And then we've got the unknown unknowns problem, right? I don't just
wanna find where someone's stuffed an executable into a PDF; I wanna find all the examples where someone has stuffed potentially weaponizable things, and I'm not even sure I can enumerate that list, I certainly can't, some of you could probably start to, into file extensions that are more typically benign looking. So let's go find all those things in my example data set, which hopefully I've anonymized appropriately. If you're looking at Bro, and that's just a very standard NetFlow parser, one of the breakouts strips out file names, so you can pull the file extension and the inferred MIME type. So, what does Bro think this
file coming up the wire is? Here are some example rules that I found in this one data set, and you read these as antecedent implies consequent, with a confidence. I'll just point you to the second to last one: if the application is an executable, then the file extension is .dll 99.9% of the time. What are the things without that file extension? Let's go find those automagically. Also, pointing to those top rows, note that rules are often kind of bi-directional, but not necessarily with the same confidence. So, sorry, rows two and three: most of the .xmls are application/xml, and all of the application/xml are .xml. You often get those kinds
of relationships, because they're strong categorical correlations, but having different confidences in the two directions is totally typical. (Let's see, a question on the previous Venn diagram... you said turn it up? Yeah, that makes more sense. All right.) Okay, so those were some example rules from some example network. We dug into one of these findings that I just pointed out to you. One of the breakages was this thing: some IP pulling down that file type with a .txt file extension. Note that this is kind of a starting point for an analysis, where some of you cybersecurity professionals can then dig in, pivot from the IP or the file or the hash or whatever, and go do the magical things you do. This is basically
a high quality tipper, I think, I hope. This is usually the starting point for an analysis, not the end point in most cases. But hopefully, again, we've gotten something pretty digestible: we did fancy-pants anomaly detection, we generated a rule, we found a breakage of the rule, and I can send this to a cybersecurity professional who can interpret it immediately using their wet neural network and figure out what to do next. Okay, you want one of these? For, like, the fourth time: I don't sell stuff, I give stuff away. And I make stuff, I solve hard problems for pay, so we can talk about those things if you care to. If you want to infer rules in Spark, there's MLlib,
and there's a frequent pattern mining module, so you get frequent item sets and rules for free. In R there's the arules package, and in Python there's a frequent-patterns module for that. The rule checker: so, what do you have? For free you get rules, which is a predicate set and a consequent and usually a confidence level. If you wanna check those rules against the data, you gotta write that yourself, but it's super easy, 'cause you just say: if my original set has the predicate and not the consequent, rut row, and kick that thing out to some file (there's a rough sketch of all of this a little further down). If you look for rules that involve lots of variables, so I've had all these toy examples with a single
predicate and a single consequent, if port then protocol, but you can have very complicated rules like if A and B and C and D, then E. A lot of times what I've found in practice is that those things tend to be nested: if A and B and C, then D; if A and B, then D; if A, then D; if B, then D. And usually the most interesting one to look at is the most complicated rule that was broken. So if you go roll this at home this afternoon in a couple of hours, add that feature; you'll be glad you did. Also really simple. Confidence is baked into the association rule mining packages; R
and Python have lift, and in Spark you have to write that yourself. The formula's quite simple. The exception list: you're gonna want one, and again, that's very easy. You just identify, with whoever you're working with, your SMEs, or you are the SMEs, the rules and the breakages that are boring, and write a little way to not be shown those the second time. Okay, here's the part where I am at pains to tell you a couple of the good things this thing does and then remind you of all of its limitations, lest you think I'm another person saying I've solved cybersecurity with math. The strengths here are its broad applicability: we have lots of categorical features in cyber data, and continuous things can be made categorical by binning.
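Here is the rough sketch promised above: induce single-predicate rules with confidence and lift from a toy flow table, sweep for breakages, and apply an exception list, all in plain pandas. This is my own minimal reconstruction of the steps just described, not the Rule Breaker code; the column names and data are invented. Continuous columns could be binned first with something like pandas.cut, and for multi-variable rules you would reach for the packages mentioned (Spark MLlib's frequent pattern mining, R's arules, a Python frequent-patterns module) rather than rolling your own.

```python
# Rough reconstruction of the loop described above: induce "if port then
# protocol" rules with confidence and lift, sweep back through the data for
# breakages, then drop anything on the exception list. Toy data, invented
# column names; this is not the actual Rule Breaker code.
import pandas as pd

flows = pd.DataFrame({
    "port":     [53, 53, 53, 53, 443, 443, 80, 53],
    "protocol": ["dns", "dns", "dns", "icmp", "tls", "tls", "http", "dns"],
})

MIN_CONFIDENCE = 0.7
rules, breakages = [], []

# Rule induction: for each port, how often does each protocol co-occur?
for port, group in flows.groupby("port"):
    for proto, count in group["protocol"].value_counts().items():
        confidence = count / len(group)                          # P(protocol | port)
        lift = confidence / flows["protocol"].eq(proto).mean()   # penalize very common consequents
        if confidence >= MIN_CONFIDENCE:
            rules.append({"if_port": port, "then_protocol": proto,
                          "confidence": round(confidence, 3), "lift": round(lift, 2)})

# Breakage check: rows that match a rule's predicate but not its consequent.
for rule in rules:
    hits = flows[(flows["port"] == rule["if_port"]) &
                 (flows["protocol"] != rule["then_protocol"])]
    breakages.extend(hits.assign(broken_rule=str(rule)).to_dict("records"))

# Exception list: drop breakages the SMEs already said they don't care about.
exceptions = [{"port": 80}]   # e.g. "never show me anything on port 80"
breakages = [b for b in breakages
             if not any(all(b[key] == val for key, val in ex.items()) for ex in exceptions)]

print(pd.DataFrame(rules))
print(pd.DataFrame(breakages))
```

On this toy table, that recovers the "if port 53 then DNS" rule and flags the single ICMP-over-53 row as the breakage.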
It's only really limited by your imagination for intelligent and informed use cases. You get great explainability here. I've worked with SMEs both on the commercial and corporate side and for the government in these SOCs; when you hand them garbage or uninterpretable findings, they hate you. This doesn't have that problem. It's really fast, both to implement and to execute. It's quick in Python and R, and it scales in Spark, which is really nice if you wanna hit, like, the last six months of NetFlow data from your big, I don't know, enterprise holdings or something. And then tuning this is trivial. Tim didn't talk about how machine learning often has these hyperparameters. So they're
like choices you have to make about which algorithm to use and how to set up the algorithm to find the things you're looking for. Tuning can be really hard, and this thing doesn't really have tuning parameters except the confidence level that causes you to care. If you've done machine learning before, you'd know that not having a million knobs to turn to get this thing dialed in right is nice. Okay, and then pitfalls. We talked kind of ad nauseam about how high confidence rules are not necessarily interesting: pick the right features, use lift instead of confidence. You have to use categorical data for this thing, so you can do binning, but if you've ever thought hard about how to
bin continuous things into categorical things in a way that's not lossy, you'll know that that's non-trivial and there are lots of ways to do it; you can't do it without losing something. This tool only identifies what I'll call orthogonal outliers. I think of that as a strength: we're looking for these anomalies that are only different in one way; they're not broadly different from the set. If you have that kind of problem, like "show me the things that are really, really weird," this isn't the tool for that. As of yet, this whole setup doesn't handle a multi-item consequent, so all the rules are these unary splits: if A and B and C, then D, not then D or E. And so if you have a
categorical correlation that has multiple parts at the end of it, you can't find it this way. That's kind of future research for me and the team, and where I'd like to go next. All right, and that's it. Questions? - [Audience question.] Yeah. Not sure I follow. - Bread and peanut butter, I got jelly, but it's no good finding, oh, it's also mustard and tomatoes and all this other stuff. If you wanna find those in the same data set, I run it through and it says jelly. - Yeah, so I think I'm tracking your question now. Could you-- - Take all the stuff I rejected and just run it through again and get mustard
on the second pass and-- - Yeah, so that's a great thought. The question was, could you identify these kinds of multi-item consequents by using some kind of recursion here. The answer is, I think so, maybe. The problem is in the original set: imagine you would have found a rule that's like "if tall then basketball player" or "if tall then pole vaulter," and each of those happened 50% of the time. You can't find that rule unless you lower the confidence level you're looking for down to 50%, at which point you're gonna find a lot of illusory correlations. - But I can't run it and have it say pole vaulter or basketball player? - You mean, like, that's the output of the two things? - Could
that algorithm be written? Absolutely, I plan on it, let's talk. Can you get it canned like this afternoon? Not to my knowledge. - What about on the AAA? - Yeah, okay. - The new service that you stand up with legitimate traffic, how do you-- - Yeah, the question was what if you set up a new service with legitimate traffic, how do you make sure you don't fall afoul of this thing? So there's several ways to kind of go about that, right? One is don't generate rules the first day of the service, right? And that's a whole, we're now kind of abstracted away from okay, that's a nifty method, how do I use this thing in practice, right, operationally? Lots of options there. One, induce rules like
Don't know one day and then use those rules for the next couple of days or use it post hoc, right? So for yesterday I found these strong correlations and these breakages That kind of helps you get away with the new service problem if you want to hold rules and like use them like into the future you're gonna have to be savvy to the fact that you're you've changed your environment and again you can I'm gonna kind of Default back to that exceptions list right you hand a bunch of things over to your smi and you say hey I found these breakages of these rules they say oh I don't care about anything that involves from
before the 10th, because that was new that day.
- So if you decided, oh yeah, this sounds really cool, where do we go next? GitHub?
- Yeah, I want it and I want it now. That's awesome, I'm glad it was mildly convincing. So you can infer rules here, and that's for free, no coding acumen needed by you; you're gonna have to write your own hooks to get data from wherever it lives, your SIEM tool, into Python, R, or Spark or something. The rule checker, again, you write yourself: it's just, if this set I'm looking at contains the predicate and doesn't contain the consequent. That's a one-liner, an additional one-liner that is useful. Three and four: you get lift for free a lot of the time, but if you want your own lift in Spark, you have to write it. I don't know if I'm gonna be allowed to share the code out from this project yet. If yes, then I'll try and blast it out to this community. If no, you'll write it yourself, and that one's the only one that's not kind of a one-liner, and then this thing is something you'll, for now, make for yourself. Yeah?
- I thought that's what you might be saying, but I also thought, well, maybe the rule breaker is the thing where I go, yeah, this is the secret sauce, this is the design pattern.
- Yeah, and it's not secret, except insofar as I haven't been given the thumbs up to push it out there. Other questions? There's a hint of barbecue wafting in, and I think everyone's ready to get out of here. All right, well, that's all I have. I'll be around through the end of the day, so if you have questions later, come find me. Otherwise, I'm not gonna be between you and your lunch. Thank you.
This 2018 B-Sides Asheville live stream will be breaking for lunch and returning with our next presentation at 1:10 p.m. from Jason Gillum with Secure Ideas on Folding Steel. We look forward to having you join us again.
And we're back! So up next is Folding Steel: The Samurai WTF Reboot with Jason Gillum of Secure Ideas at 1:10 p.m.
Okay. The new YouTube channel has every video from every conference, by the hour of the presentation, online with text in there. Go back to the last four years and see every video. You can binge on B-Sides Asheville. I don't know if you guys have this.
And so yeah, go ahead and unmute. I just did. Oh, you just did. I'm going to turn on the backup recording. So this is the-- Just in case stuff goes wrong. Yep. Fantastic. Welcome, everybody. I hope you had a good lunch. My name is Jason Gillum. This is my brother Mike. And we are going to talk about Samurai WTF. Some of you have probably heard of this before. The project is actually, this is its 10th anniversary this year, so it's been around for quite a while. And what we're going to do is show a little bit about where the project has been in the past and then where we are taking it now and what our future plans are as well. We
do have, I'm going to start off with some slides kind of going through where it was, and then Mike's going to jump in with some live demo-y stuff, hopefully all going according to plan, and show you what it's doing now. So the Samurai WTF stands for Web Testing Framework; I promise that's what it was when they came up with it. And we're actually in the process of changing this, because although it started off with this idea of, hey, let's put all of the tools in one spot for doing web app testing, kind of like a Kali thing, what's actually happened is over the years it's evolved into more of a
teaching environment. So it's used very often for when we're teaching classes on web testing because we actually include several vulnerable targets in the build. Okay, so this is a, these days it is a virtual machine based solution. You can bring it up, it's got targets already in there, it's got tools already in there, and you can play around with it in practice, so on and so forth. It started in 2008. The boss, big boss, CEO of Secure Ideas, Kevin Johnson, he started the project because he was bored. So he didn't get his talk in at DEF CON or something like that. And then he started messing around with this idea. And then shortly after that, Justin Searle, he joined as well. And
at the time, they were both working for InGuardians. I don't know if any of you are familiar with InGuardians; that's where they were working at the time. Now, in its first iteration, it was a live CD ISO. This was Kevin playing around with a tool called Remastersys, which I understand is basically bash scripts or something like that; I've never used it. It had some cool ideas. For one thing, anybody who wanted to use it didn't have to install everything themselves; everything was inside of a distribution, which is pretty cool, right? That's what we want to do with these things. Because it was a live CD, you could
put a CD, you remember the round things that you could put inside your computer? I can't do that anymore. It looked like this right here. You could slide it in there, boot up the computer, and you would have, it would just pop right up in front of you, and then you had an installation right on the desktop, and you could install the rest of it that way. So really cool. Usually works, especially if you can boot from a CD. Could also, because it's an ISO, you could also set it up to install off of a thumb drive, which is something they started doing later. And it was hosted on SourceForge. And you can still find
up to version 3.0 on SourceForge as well, so that's out there. It's not a live CD anymore; it's a virtual machine, and I'll get to that next. There are some problems with building an ISO for a distribution like this. Maintenance is a bear. Can you imagine having multiple people working on a project, but every time you make a change, you have to basically build a CD, right, and test it to make sure it's all working? So there were some issues with that. It also made it very difficult to backtrack if you made a mistake on something: you've already written it to an ISO, and then you try to install, and it's just a very lengthy process. Also consider that we're talking about a time frame when computers were slower than they are today, so it took even longer than you would expect to get through all of that. So the next step was, hey, there's got to be a better way. We moved to VMware. For several years, Samurai WTF was maintained as a VMware image. Lots of advantages there. For number one, you could snapshot things: go so far, try it out, oh, that didn't work, back up, restore from the previous snapshot, and that didn't take too long to do. So that was really good, and we didn't have to build out an ISO each time, but it also had some other problems. Maintenance is still a bear, except this one has snapshots. And collaboration? Collaboration sucks. We had three or four people trying to work on this project at once, and the way this would happen, because I was involved in those three or four people working on it, is one person would say, "Oh, I can't, I have possession of the VM right now, the gold version of it. I'm making some updates because I'm teaching a class in a few weeks, so no, you can't go and make any changes right now." And so we'd be trying to coordinate all that, and if it's just one or two people you might be able to make it work, but once you get to more than that, it just wasn't working; it was very difficult. So from here, what we did is we tried a few other solutions. A little over a year ago, or was it two years ago, I can't remember, two years ago, I was playing around with Packer.io as one option. I had a Debian package so you could hopefully install that, but that had a lot of the same problems as building out an ISO, really, because every time you made a change you had to rebuild the entire thing. So that was problematic as well. In the end, what we were looking for really is a way that we could collaborate on something with change control, maybe something sort of like source control, and be able to easily rebuild an environment from scratch. That would be handy. So in the end, what we decided is this, and this is actually an idea that Mike started with; I'm very thankful for that. This has made things a lot easier. So now, Samurai WTF is in its own GitHub repository, and most of it is a collection of script files for setting things up and a Vagrant file. And we'll go through exactly what that is, because I'm sure that some of
you probably know what Vagrant is. Hands? People have used it? Okay, so maybe a quarter of the room at most. So Vagrant is, basically it's tied to either VirtualBox or VMware or whatever the provider is, and you have a script that says this is how to build this VM. And so you just basically, Mike's going to demonstrate this, but basically with a couple of quick commands, it just pulls everything you need and stands up the machine for you and it has everything all installed and all ready to go, which is pretty awesome. Now, from testing to teaching. So we started, like I said, this was initially a testing environment we were going for. We want this to be more of a teaching environment. So one of
the things that we have to maintain on here is our vulnerable targets. And we start running into issues if we try to put too many targets on there; it gets really hard to manage which targets go where, which port a target is on. Because what we're talking about here is vulnerable web applications. Some of you are probably familiar with WebGoat. We don't currently have WebGoat installed, but what we did is we installed several using Docker containers. This is kind of the new way of doing things, or so I've been told. We have some hipster developers on our team now, so they told us Docker's the right thing to do, so that's what we did. So we have Docker containers for several targets already built: DVWA, Damn Vulnerable Web Application, the most recent version of that; we have Mutillidae II, Juice Shop, and then these two at the bottom. If you've ever taken a SANS class or any class that features the Samurai web testing framework, it probably goes through Dojo Basic and Dojo Scavenger. Those are older applications; we include them to keep it complete, but we're also trying to improve on those a little bit as well. But these other ones at the top, DVWA and Mutillidae, are actually very similar in their style. So if you're familiar with the OWASP Top 10 vulnerabilities and just want to focus in on certain vulnerabilities, those two applications are a great way to do that, because they actually have a menu: here's OWASP's Top 10, number A1, injection flaws, and then it goes through a whole menu of all the different types of injection flaws that are built in there that you can practice. They even have different difficulty levels as well, from easy, where there are no controls preventing the SQL injection from happening, to medium, where there are some controls in place so the injection won't be straightforward, to difficult or impossible. Juice Shop, we'll show you that briefly. Juice Shop is actually really cool; it's my favorite vulnerable app right now. It's a full-blown app, so it's not a menu style of vulnerabilities. Instead, it's an actual online store for buying juice, hence the name Juice Shop. And it's built as a modern application: it's a Node-based app, it's single page, it has a very modern feel to it. So it's not like all of these other ones here, which work for what I would call traditional web application pen testing. If you throw an automated tool at one of those, it'll be able to find injection flaws and cross-site scripting flaws fairly easily. But if you try to do the same thing with Juice Shop, it might not find them, because there's a lot more logic on the front end, on the browser side, which is what we're seeing a lot in applications these days. So if you're doing any web app pen testing, you should definitely look at Juice Shop and try to get yourself very familiar with how to do a test with that type of application. So I guess at this point we will switch over to demonstration stuff. We're gonna trade mics here, because we only have one mic and Mike needs a mic. All right, can you hear me? Good. All right, we're gonna exit presentation mode because that's not where the demo lives. I'm gonna start by hopping into the kind of live, running Samurai VM; this is actually from last night's training. So you've got Burp in there; if you've done any web testing, that'll be familiar. The scrolling, by the way, is for presentation purposes. Normally, when it's on your machine, what I would suggest is to bump the resolution and full-screen it; when you're projecting, that's sort of a different issue. So here we were doing some cross-site scripty type stuff. Targets, the targets are there. We got rid of KDE, which was the desktop environment for old Samurai, up to version 3; it was Ubuntu-based, it was KDE-based. And one of my frustrations with it that kind of led to this was that it got big, really big. So I had thumb drives, I had a training class coming up,
I built a new target and it bloated up my VM, and I couldn't fit it on my thumb drives anymore. So I went and used the old version of it, and then after I got through that training class I went home, got pissed off, and started ripping it apart and rebuilding it, in Vagrant. So I cut a lot of that desktop environment weight. This is Openbox. It's got a menu; you right-click for that. It's a lightweight kind of window manager. It's got a couple of utilities, but not a lot baked in other than what you're actually using. I guess that's kind of it about that part. These I actually bookmarked for last night's training class; currently the bookmarks are not baked in. But these will preload the first time it comes up. These are browser extensions that are commonly used in our training classes at least: you have FoxyProxy there, Wappalyzer, Retire.js. If I go on over to Mutillidae, it gives you that sort of information, stuff that I commonly use on pen tests as well. It's all kind of nicely baked in. And then we've got these Dockerized containers for the apps. So if we have current targets, future targets, anything that has a conflicting dependency, we have some level of segregation between them that gives us a lot more flexibility with the targets. And this is nothing too fancy or magical here. This is the environment, really, but it works well for its purpose. That's the Burp Suite Community Edition that's baked in, along with ZAP, the OWASP Zed Attack Proxy, the man-in-the-middle proxy. And if you ever lose track of what your targets are, you can always, can you read that? Yeah. You can always cat out your /etc/hosts, because they're there. So nothing too shocking or surprising there. Some of those were also customizations that I put in for last night's training class; they're not quite ready to be baked in yet, so they're not in the master repo. That's just the bottom two that I'm referring to; the other ones are there. So the way it works is there's an Nginx reverse proxy that--
- Say something, I think we should be--
- Something.
- Okay, let me ask.
- Something else.
- What?
- Say something else.
- Oh yeah, I see the audio level.
- Is that good? Bad, good, bad, louder?
- I'm working on the screen when it comes. You need to repeat everything to the left. I can do it and I will. Okay. Welcome back, everybody on the live stream. So what we did is we went to GitHub. We opened up the Vagrant file in GitHub. I'm going to leave this up because, nicely, if you look at the very top right now, you can actually see where the Git repo is. I think we have that on a later info slide as well. I'm seeing a head shake over here. So that's there. It's nicely indexed by things that do search indexing and stuff, so it should be findable if you use anything that's not Bing. [laughter] That was a cheap shot, that's not fair. All right. So yeah, the Vagrant file describes a VM. In this case, VirtualBox is the main provider we target, although there's nothing fundamentally wrong with using this to target VMware. Vagrant can do that, or Hyper-V, as long as the base box supports it. And the particular base box on this one right now is the Bento one, which is managed by Chef, which is another orchestration tool. I'm seeing some head shaking, yes. I get that a lot actually in the office: somebody will throw out a random word and I'll say, yeah, there's a tool called that. It's sort of a running joke with Kevin. So it's kind of a vanilla Debian box that we build everything up on top of. And you see there's a shared folder there. It has the VirtualBox guest additions baked in. Usually that works; sometimes there can be version mismatches that cause problems, and currently we're kind of manually dealing with that, but long term we'll be looking to automate that part of the solution. So if we go down into some VirtualBox-specific options: the name, whether to bring up a GUI instead of going in headless. And then we have provisioning scripts below that, bash
scripts that are run on the VM when it is built, when it first comes up to provision it. There's an odd naming convention to them, probably because I'm unconventional. But so there are-- if you scroll down further, what's actually happened here is-- I've got to zoom out to unscroll. What I've done is I've set up multiple different VMs. One of them is a default. There we go. All right. Everybody motion sick now? Good. So there are multiple, there are actually a couple of different builds of the VM that call only a subset of the provisioning scripts and then there's the main default one that calls all of them. And essentially that's broken down into just
a target server, which I would caution is not tested very often, although you can always raise an issue on GitHub and get a fix promised pretty quickly, or, probably more likely, directions on how to fix it yourself, and then you can submit a pull request, which would be awesome. But it's also helpful for us: if I'm trying to do something with a target, I don't want to rebuild the whole machine every time, because some of those provisioning scripts also update packages and stuff. It can take a while; for me, somewhere around five minutes probably, which is still a much faster way to do iterative builds, but it's not super, super fast. To sort of solve that, I've broken it down. I've got a target server
in there that calls those provisioning scripts. I've got a desktop environment that just does those, just the desktop, no targets, which allows for faster iterations when doing development work on it. And so it's essentially, it's the same layout as the one we just looked at, but calling a smaller subset. Instead of four provisioning scripts, there are two for the user environment. So that's... I mean, that's your Vagrant file really in a nutshell. In terms of future direction for it, we have some ideas, some pretty, I think, somewhat ambitious ideas, at least I like to think they're ambitious, such as having like a YAML config file that is, these are the tool sets and these are the targets that I
want for this particular class; generate this, give it to my class, and it will build it with just those particular bits in it. Possibly even taking that to the point of having an interactive kind of script when we run the vagrant command to say, well, you don't have a config file, do you want all the targets or do you want to choose them? That sort of thing. Because it's still Ruby; if we can figure out how to write that in Ruby, then it should be possible to do. I believe people have done some work in that area. I don't know why you're looking at me. Because that doesn't mean you won't write it.
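To make the YAML-config idea concrete, here is a rough sketch. The real implementation would live inside the Vagrantfile and be written in Ruby; the sketch below only illustrates the idea in Python, and the config format, keys, and script names are made up.

```python
# Hypothetical sketch of the YAML-driven build idea described above.
# The real implementation would live in the Vagrantfile (Ruby); the config
# format and script names here are invented for illustration only.
import yaml

# Example samuraiwtf.yml:
#   tools:   [burp, zap]
#   targets: [dvwa, juiceshop]
with open("samuraiwtf.yml") as f:
    config = yaml.safe_load(f) or {}

scripts = ["install/base.sh"]
scripts += [f"install/tool-{name}.sh" for name in config.get("tools", [])]
scripts += [f"install/target-{name}.sh" for name in config.get("targets", [])]

# Vagrant would register each of these as a shell provisioner.
for script in scripts:
    print(f"provision: {script}")
```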
Okay, we'll zoom in a little bit here. We're not gonna watch the whole build because, as I said, it takes several minutes. But I'm gonna do a quick one; this is one where I've renamed the machine, so it should show as a separate machine. Hopefully this doesn't explode or anything, but if there is something that's gonna explode in live demo mode, it's going to be this. I'm gonna move that magnifier out of the way 'cause it's in the way. Run a vagrant up command. So the first thing it does is import the base box. In my case, it's got that cached on my system because I've been using that base box. That looks bad; I have two of them kind of going at the same time. So that's probably what
that is. But yeah. That's because it didn't clean up properly last time when I tested building this box. But really, truly, if you run that command... it just kind of does its thing. It grabs that config file, it runs through it, it does the provisioning of the box, and then you end up with a GUI with a login. It's a text-based login. We didn't actually put a login manager on it. So it's just your standard Linux login, and then it will load into the desktop environment. And it's really that simple. When you're done, if you were successful in building it, which you will be, trust me, you will, then Vagrant Destroy, in this case, obviously it's saying that...
The other ones were not created. This one says it was, and you just say yes, and it should clean up correctly; it'll take it out of VirtualBox. What will still be on your system is that Debian base box, so next time you build it, it won't have to fetch that again. That's kind of it. If you go back and run the build now, it should work, right? You know, we'll see. I don't believe it will, because I believe I already did that. Yeah. The fix when this happens, by the way, is to actually find where VirtualBox has put that machine and blow it away, which is not as difficult as it sounds, but I'm not going to do
it right now. Yeah, so what are the provisioning scripts doing without going into incredible detail? Because really, you can go and look yourself if you want. So I've got this broken down into-- I'm zooming. Sort of stuff is grouped into two directories. There's the install directory, which is stuff to run. There's the config directory, which is random config files that I don't want to figure out how to generate on the fly, so I've just written them and included them in the repo. But also that way they're properly version controlled, and we can see when somebody changes a config and it breaks stuff. We have that kind of audit history on it. So quickly, what sort of stuff is in here? Well, this sort of stuff. So
what Openbox starts when it comes up. There's the definition for the menu that I got when I right-clicked, some various scripts that are available to it. Somewhere in here there's a, ah, there it is, the nginx config for the reverse proxies. That sort of stuff is grouped in under here. And now that you know it's there, you can actually go and pick it up, do stuff to it, make it better, and submit a pull request, 'cause that would be awesome. The install stuff is those provisioning scripts that you saw, doing various different things. I'm not gonna walk through each of them 'cause that's gonna get dry and boring really fast. But
you can see here stuff like setting the hostname. Some of this is a little more on the Linux-y technical side; I'm gonna look for something that's a little bit less so. Yeah. Some of this is just stuff like installing Chrome, for example, simple package-install-y type stuff. Java runtime, there you go. WPScan also uses Docker; it's the way that we fetch it. There are other ways. That sort of thing. Pretty simple stuff, just bash scripts fetching stuff. If you're familiar with the Linux environment, none of this would be challenging for you to write yourself, but you don't have to because it's here. That's kind of the bulk of it.
That was the user environment one; this is the targets one, and the reason I want to point out the targets one is because I want people to add targets. I think it would be great if people would add targets. There's a target bootstrap here. Our teammate Cory, who's floating around the conference somewhere over there, yes, he's our Docker hipster, and he's done the DVWA and Mutillidae ones. He's pulled in the Docker wrappers for them, so fetching them at this point is just pulling the Docker images, which he's nicely indicated with some echo statements there. Juice Shop is from Björn Kimminich, who is basically the project manager; it's an OWASP project. He's also by far the main contributor to it, doesn't sleep, does like five new builds a day, and as far as I can tell is a cyborg or something. But it's a fantastic target environment, as Jason said. And then there's the really kind of sketchy Samurai Dojo stuff that might or might not have my name on it, where it's a wrapper that has the original maintained project as a Git sub-repo. So it pulls that from its original source and then wraps it up in Docker and builds a container kind of on the fly. So there are different ways of solving different problems there. But that's really all there is to the whole thing.
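To give a feel for what that target bootstrap amounts to, here is a rough equivalent using the Python Docker SDK rather than the project's actual bash script; the image names and port mappings are the common Docker Hub ones and may not match what the project pins.

```python
# Rough equivalent of the target bootstrap, using the Python Docker SDK instead
# of the project's bash + docker CLI script. Image names and port mappings are
# the common Docker Hub ones and are assumptions, not the project's pinned set.
import docker

client = docker.from_env()

targets = {
    "dvwa":      ("vulnerables/web-dvwa",  "80/tcp",   8080),
    "juiceshop": ("bkimminich/juice-shop", "3000/tcp", 3000),
}

for name, (image, container_port, host_port) in targets.items():
    print(f"pulling {image} ...")   # the real script does this with echo + docker pull
    client.images.pull(image)
    client.containers.run(image, name=name, detach=True,
                          ports={container_port: host_port})
```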
It's scripts for building it, config files, and a Vagrant file, which is really a config file written in Ruby. Someone on the internet was asking, are there any Windows targets? Windows environment, like Windows containers, Windows operating system? The question is: any Windows targets, question mark. I'm going to take that as a yes. Windows servers, currently there are not. If there is a particular one out there already floating around that we can pull in, that would be great. Otherwise, when we write it, we'll, you know. Which one? Yes. Yeah, that's a good point. If you have requests like that, features, ideas, we'd love to hear them. Head on over to the GitHub and open an issue. Not if the targets are licensed in
a way that is shareable; it's not a commercial product. No, I meant if it was a vendor. Yeah, potentially. We'd have to figure out licensing for it, which is not a path we've currently started down. I can see the appeal of it. It makes a lot of sense, considering the number of environments we test that are mostly Windows servers, right? Of course you can; they've got Core now. The question was, can you run .NET on Linux? Why? That would. You're supposed to repeat the question. I'm not. Sure. So my teammate Alex is going to come over and do a vagrant up. We're going to see if it breaks twice or redeems itself. This could go either way. All right. So he's furiously typing in a dark room with code projected onto his face. We're currently in the Samurai WTF folder right now and we're going to just do vagrant up. One, two, three, and...
If you don't have the Debian image, it actually goes and fetches it. One thing to watch with the Bento ones, although they're actually... So HashiCorp is the company that does Vagrant. They do some commercial tools for infrastructure as code type stuff. One thing... They recommend using the Bento ones for Ubuntu and stuff. But one thing to watch is Debian 9.4 is a separate base box. So... If that's changed in my Vagrant file for Samurai and I go into a Vagrant up, I'm going to end up with two base boxes sitting on my system and probably want to go and clean up the old one that I'm not using anymore. Unfortunately, in that case, it's not really automated. If a Vagrant box with the same
name is updated, you get prompted to download the update if you want to. So as long as it maintains the same name, it's sort of version controlled, but not if they rename it like they do for that particular project. On your right, stage right side of the screen, there's, it's opened up. So this is sitting at a command line. The machine has actually come up, but it's still doing a whole bunch of provisioning. When you get to that part, don't mess with this window; just kind of leave it there. People will try to log in right away because they've got a login prompt, so it looks interactive, but it's not really ready to
go. And it will restart at the end, and when it restarts it will actually have the Samurai WTF hostname. That's how you know that it's done, or by watching your output window doing your vagrant stuff. So that's just happily running away; I saw Docker Community Edition just got installed there. That red stuff's not bad; it's most likely just warnings. Something fetched through wget that outputs in red is, I think, one of the big ones that gives you a big block of red. It's not necessarily errors. There can be errors, and some of them are okay, but that's not one of them. Um, yeah. So that's really kind of, what other things do we want to do with it?
Well, what else do we want to do with it? Yeah, in terms of future direction, better support for some of the other providers would be great; that's something we'd like to do better. Hyper-V stands out because people that run Hyper-V can't also run VirtualBox at the same time, so making them disable their one hypervisor and install VirtualBox to do a training class sucks. It's obviously something that would be better not to have. Packer is something that we're looking at again. It's matured quite a bit, and it would allow us basically to maintain a Vagrant base box for Samurai. So instead of it doing all of these updating and build steps, it would
have a base box as a starting point that's got everything installed. I like the idea of maintaining both kind of at the same time, not because I like more work, but because I like the transparency of starting with a clean Debian build and seeing everything that's being put into it and what's being run, what's being changed. As a user, as a somewhat security paranoid user, you would probably like the opportunity to see that in some cases. So there's that. Are you running it in Python? No. Rewriting it in Python, no. I'm not gonna rewrite Vagrant. It's already in Ruby until they migrate it to Go, which they haven't announced they're gonna do, but it's a Ruby project, so
it's probably a safe bet. Yeah, so adding in mobile support for mobile training targets and mobile tooling would be definitely on the roadmap sooner rather than later. Somehow I made myself the guy who gets tagged with a lot of the mobile stuff at work, so it just makes sense for me to shove that in there too. I don't know why. I think it started with me rooting a device and then suddenly I'm the guy who does that stuff. So the moral there is don't do stuff that you don't want to continue doing. Find someone to pawn it off to because otherwise you're a victim of your own success or the way my dad always puts it is the reward for good work is more work.
Which my wife would say I subscribe to by sucking at things like doing the dishes. More targets: there are definitely open issues on GitHub for several other targets, NodeGoat, RailsGoat, basically a whole bunch of Goats, a whole herd of goats, and there might be a few others in there currently. Again, if you have a target in mind, first of all, you can open an issue and say, hey, I'd like to see this in there too. It's a side project, but when we get the chance, we'll put it in. Or you can do it yourself and submit a pull request, because that'd be really cool. So it's a project where we would like more input. As long as it's mainly just the two of us working on it, and we've got Alex looking at the Packer options as well, as long as it's mainly our little group looking at it, we don't have that outside perspective, and we welcome it, so it'd be great to have other people throw in ideas and input. If I shoot down your idea, it doesn't mean I don't like your idea. But the idea of Dockerizing the targets came out of a conversation while running a training class at a B-Sides event like this one: somebody in the class said, have you thought about doing this, and Kevin was there and he was like, I'm not really comfortable with Docker. I think he gives pretty honest answers, and at the time
that was his feeling. He's getting comfortable with Docker, or at least with delegating Docker work. So, question over here?
- Are there any update utility scripts inside the VM, or do you pretty much have to destroy and re-up?
- Yeah, you could re--
- A lot of the stuff you can get at through the Docker engine and run. Is that updateable?
- Usually what I would do is destroy and recreate. You could get onto the command line and run your usual Docker commands, but there's no sort of automatic updating. Yeah, that's likely to work. It might also throw some errors for stuff already being there; it's not necessarily going to be graceful, but it should more or less work. So the suggestion over there was, yeah, do a git pull and then do a vagrant provision, which will rerun the provisioning scripts on the existing VM. The nice thing about Vagrant is it's meant to be destroyed; it's supposed to be re-creatable. Vagrant kind of came about as a development tool: I'm running my development version of my app, I'm running my development server on my machine, and I want to be able to destroy and rebuild it at will. Which is why my attitude, and I think in general our attitude, is usually we want to blow it away. And that's a lot of red. That's dumb, though.
Okay. Usually we want to blow it away and rebuild it. I do that for pretty much every training class I teach when I set up the class VM, and then I generally export it and stick it on thumb drives for anybody who doesn't have the VM, doesn't have the Vagrant build on their machine. But I will say, one of the best training classes I did, in terms of not having technical issues, was the one where we sent out a message a week in advance and said, hey everybody, go get this and run vagrant up, and we had a class full of people show up with working target environments right from the start of the class. That rarely happens; for anyone who's taught a class, in the first hour you're gonna be going around troubleshooting people's issues on their machines a little bit. To me, that was sort of the proof that if we do it this way, and we are vigilant about testing regularly and going through the process, we can hand it out to a training class and see it work really well. Your colors are weird. It's because it's tunneled over SSH and VNC. Okay. Yeah. VNC seems like... So there's a terminal. The environment is up at this point. That gives you a little bit of an idea how long it takes to do a fresh build.
- That's a smaller, slower machine too. It builds faster on that.
- Yeah.
- It worked.
- It worked. Sorry, we've got a professional heckler.
- Oh, that's cool. Yeah.
- Any questions?
- Let's move on to questions. Questions? I know we've kind of been answering them on the fly. We'll be around if you have any ideas or questions that come up later. It's not something we've discussed, going to external targets. I wouldn't say it's off the table, at least as an option, but most of our focus has been on internal targets, because we're gonna teach a class somewhere. Yeah, or how good is it? Yeah, so open a GitHub issue and maybe think about submitting a pull request.
I'm going to stand a little closer to the middle so that we can share. Any other questions? Yeah, sorry, too long. Yeah. I know you use Retire.js inside of Chrome as an extension, but is that only because, you know, the Burp community edition is that below the extension? Basically, Retire.js, is there any reason to use it inside of Chrome when you also have it inside of Burp as an extension? I like doing both. When I pen test, I run Burp on my native machine. I have it installed there as well as a plug-in. In the VM, I don't think we know of a way to automate the installation of Burp extensions. So we'd have to
take the students through manually installing it for them to have it, whereas with Chrome, I have it set up so that it actually fetches and installs it when it first launches.
- I was talking about real-world testing. What benefits do you see between the Retire.js extension and the one in Burp?
- For me, I actually prefer running it inside of Burp, for the reason that even if I navigate away, the information is captured in history, so it's always there as passive results inside of Burp, whereas inside your browser you have to actually be looking at the page that has the problem in order to see the results.
- Yeah, I usually don't load up Retire.js, I just didn't know if there was some benefit to it as opposed to just not being able to. It doesn't cost anything, so.
- Well, it's Chrome, it always costs something. Yeah, I think as far as I'm concerned it comes down to preference, and I do both. Question?
- So assuming you get as far as what's on the screen right now, if this is something you weren't doing every day like y'all do, what would you recommend as a smoke test to say, yeah, you actually did it right?
- So the question was, as a test of the environment, once it's up to the state it's in on the screen, what do you do as sort of a smoke test that it's up and working? I would open Burp, because you know you're going to need it, and I would open at least one of the targets in the browser; you could iterate through all of them. Definitely if you know you're going after a specific target, it's an obvious choice, you're gonna test that target. But those are kind of the main things: making sure the targets came up, making sure your tools are there. If you've got that, you're pretty much good to go.
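As a rough sketch of that kind of smoke test, something like the following would do; the hostnames below are placeholders, and the real ones are whatever the VM's /etc/hosts lists.

```python
# Minimal smoke test along the lines described: hit each target and make sure
# it answers. Hostnames are placeholders; use whatever the VM's /etc/hosts lists.
import requests

targets = ["http://dvwa.test/", "http://mutillidae.test/", "http://juiceshop.test/"]

for url in targets:
    try:
        resp = requests.get(url, timeout=5)
        print(f"{url} -> {resp.status_code}")
    except requests.RequestException as exc:
        print(f"{url} -> FAILED ({exc})")
```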
Sometimes, as a point of interest if you do encounter any issues with how the targets come up: there's a bash script sitting in a hidden .scripts folder off of the samurai user's home. That bash script is set up on a cron job, so if that hasn't worked... Not offhand. I think it's startuptargets.sh. We're out of time right now; we're getting the cue to get off stage. But we'll be around. We'll step out into the lobby; if anybody has any other questions about this, just approach us there, or open an issue on GitHub. Yes. Thank you. I just gave you this. That's awesome. Thank you.
Up next, Breaking Everything with Alejandro Caceres with Hyperion Gray at 2:10 p.m.
Sound projecting? Yeah? You're live, sir. Yeah? Whenever you're ready. Okay. I'm projecting, like, my mic? That's awesome. I can't tell at all. Anyway. So hey, what's up? I'm Alex. This talk is punk.sh: mass scanning all the things. And we are at B-Sides Asheville 2018, just for those of you that did not know that. So, who am I? I know that's a really creative title; I'm sure nobody at a security conference has ever thought of that. Hyperion, I go by Hyperion. It's just a stupid old hacker name, old-school kind of hacker name. But you can just call me Alex, so that's cool. But if you call me Hyperion I'll feel, like, really awesome, so please go ahead and do that anytime that you want. Nobody's done it yet, actually. He's so sweet, I know his name, hey Frank. Oh, I am a hacker, pen tester, engineer. I build a lot of software as well, a little bit more in the past; I used to do that, but now for the most part I just break things. Also an exploit writer, I dabble in that, and do a lot of Linux exploitation stuff. So I like building stuff that breaks stuff is kind of the short way of saying that. I co-founded a company called Hyperion Gray, small technology R&D. We work mainly with the ARPAs, DARPA and IARPA, in countering child exploitation, countering violent extremism, countering human trafficking, and that kind of stuff. So, like, basically really awful shit.
So we are based in Charlotte, North Carolina. We drove out here in our RV with our two dogs and a cat. The dog on the left, on your left, is Callie, and the little dog is called Danger. The cat is basically just Kitty, but her actual name is Aurora. That's just so you get to know a little bit about me. So what is punk.sh? What's this thing that I'm talking about? It's a massively scalable real-time web and network vulnerability finding, port scanning, OS discovery scanning, banner grabbing search engine. And most of the talk is going to be kind of deciphering what the fuck I just said. So don't worry about it if that all doesn't make
sense to you. Has anybody here heard of Punk Spider or used Punk Spider? Sweet. A couple people. That's awesome. How many people have heard of Shodan? Yeah. Okay. So one of my biggest goals for the last five years I've been working on Punk Spider and Punk.sh is to have more hands raised for Punk Spider. I don't know why. I have nothing against Shodan. But I'm really disappointed right now. So... It has a web vulnerability crawler and scanner that kind of powers it. It's called Ferret. It basically is able to make... It's an asynchronous web crawler, so it's able to make a ton of requests at once very, very quickly. It's also... Okay, so that was custom built,
so we actually built that part. It's open source; you can download it, use it, whatever. It used to be a thing called MassWeb that I still half maintain, but slash don't really at all. I'm lying to you guys, I'm sorry, but it works, and that was our old kind of web application scanner slash library. Web application scanner library, actually, yeah, that's the right way to put it. Our current version of punk.sh uses a distributed queuing system called Kafka. Are any of you familiar with Kafka? I know that guy is. No? You, Nathan? Yeah. So I'm sure some of you know a lot more about it than I do, so I'm going to do a very crude explanation of it, because it is very important to the entire workflow of how we're scanning things en masse and giving those results back to you. But the TL;DR of this whole thing is the system is able to find lots of vulnerabilities, it does it really, really fast, and it lets you look through them in a simple interface, which I'm gonna be demoing to you in a little bit. I actually have very few slides, I just talk a lot. Hyperion Gray is right there. Basically it all started with, the third one was the big one, starting with the DARPA Cyber Fast Track. Is anybody else here in the DARPA Cyber Fast Track program? A really cool program started by Mudge from the L0pht, who basically was giving out small
grants to hackers that are looking to kind of break into the DoD space. So that's kind of been our whole thing. All right. So from that, getting actually into punk.sh and how it works, this is the kind of obligatory architecture diagram. And I am going to go through it. I'm sorry in advance, but... I just want you guys to really understand what's going on with the whole thing. So let's start with the Kafka queue part of it, right? So you see that big like monolith looking thing from 2001: A Space Odyssey. So it's... On the left side, there's a thing called Punk Scanner, right? You see that little box that says Punk Scanner. This is basically what feeds the
inputs to a Kafka queue. And... A Kafka queue is distributed, right? So it has multiple machines, but they're all acting as one queuing system. And I'll do a little bit of a deep dive into that a little bit later. So looking at it, it kind of load balances itself or kind of however you want to look at it. But looking at Kafka specifically, there's a thing called the nmap-input topic. That topic holds about four billion domains that we're going to scan. So the two of those little blue boxes to the right of that topic are kind of the real heroes of the whole piece. There are a series of machines that grab this information from Kafka. In this case, like I said, 4 billion
domains. They grab that information from Kafka and do their thing, which is either using Ferret or Nmap, eventually doing both. We started to do this through a VPN. I'm sorry, we started doing this without a VPN at first, and we started to get banned from a bunch of cloud computing places, which, I mean, we knew was probably coming. But on the other hand, you gotta try, right? Sometimes people relax about that kind of stuff. AWS does not, just FYI for all you hackers out there. So we started doing it through a VPN. Of course, we get banned from VPN providers as well, but it's so much easier to switch over to other VPN providers than it would be to switch from a whole other cloud computing environment, right? So anyway, all that is to say is that we use Ferret and Nmap through a VPN, which is shown up there on the diagram, and then send the results to the respective output topics, where Punk Scanner is ready for them. Punk Scanner is kind of the whole enchilada around Ferret and Nmap, right? It's ready for them, it does its thing, and everything is indexed into CloudSearch. Has anybody here heard of CloudSearch at all or used it? A few people. It's not a widely used kind of search engine back end, but it's similar to something like Solr or Elasticsearch. Are those more familiar to people? I'm seeing a lot of nodding heads and stuff, so that's cool. But CloudSearch basically is really nice, and I'm not gonna start selling Amazon products for no reason, but it scales very nicely. So remember, we have 4 billion domains that we're gonna scan. That means a lot more data than 4 billion domains, because we're gonna have the domains and then we're gonna have the results for the domains, which are gonna be a lot more information. So anyway, that is to say, it's a fuck ton of data. So we needed something that would scale nicely, even though CloudSearch kind of blows otherwise. There's a distributed RDS instance as well, which is another Amazon product. And then, yeah, that's it. Did that make sense? Did that make sense to you guys? Was that all over the place? Yes, ma'am? What do you mean with RDS? Yeah, RDS is just another place to store certain types of information. So it is redundant: most all of the data that's in RDS is in CloudSearch, let me put it that way. So CloudSearch is a subset of it, but it's a search engine back end, which means that it just kind of has a bunch of canned functions for you to do search-engine-y type stuff, like search. So yeah, nice. Thank you. And that's my wife, by the way. She asks questions when I forget to mention things that I'm supposed to mention. So you guys just got shilled really hard.
It was awesome. Was I convincing? Did I make it sound like I didn't know her? That's good. Yeah? All right. That's cool. So I already asked who here was familiar with Kafka, so I can pretty much skip this slide, but, well, whatever, I'll talk about Kafka for just a second. Basically, for our purposes, just think of it as: there's a shit ton of machines, and there's an input topic and an output topic. You shove stuff into the input topic, which is what they call just a group of stuff; in our case, one of them might be domains, right? And then something is listening on the output topic, which could be something like scan.out or whatever, whatever you want to make it, essentially. So the distributed nature of Apache Kafka really allows us to do real-time distributed computing. Input goes in, output goes out. It's kind of that simple when you look at everything that's listening in on Kafka, and that's where all the information is flowing through. It also automates load balancing of which machine sends output, which machine takes input, et cetera. You don't have to worry about any of that; you can just kind of connect a bunch of machines and it looks to the developer as just one.
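For a concrete picture of what a worker in that diagram does, here is a small sketch along those lines, assuming the kafka-python client. The topic names follow the nmap-input naming mentioned earlier, and the actual Ferret/Nmap run is reduced to a bare placeholder nmap call.

```python
# Sketch of one worker in the style of the diagram: consume domains from the
# nmap-input topic, scan them, and publish results to an output topic.
# kafka-python is assumed; the scan itself is just a placeholder nmap call here.
import json
import subprocess
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer("nmap-input",
                         bootstrap_servers="kafka:9092",
                         group_id="scan-workers",   # Kafka balances partitions across workers
                         value_deserializer=lambda v: v.decode())
producer = KafkaProducer(bootstrap_servers="kafka:9092",
                         value_serializer=lambda v: json.dumps(v).encode())

for msg in consumer:
    domain = msg.value
    # Stand-in for the real Ferret/Nmap run.
    scan = subprocess.run(["nmap", "-oX", "-", domain], capture_output=True, text=True)
    producer.send("nmap-output", {"domain": domain, "result": scan.stdout})
```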
So looking at Ferret, this is kind of the history of it. It used to be a thing called PunkScan; I think there's still an open source repo that you can look at. It's probably kind of embarrassing code. Actually, please don't look at it. That got turned into MassWeb, which is actually a library that you can install; you can just pip install MassWeb. It's a crawler and fuzzer, essentially, so it's a crawling and fuzzing library for Python, is what I should say. It doesn't natively work with Hadoop, but it is very Hadoop-adjacent, so it's very easy to plug into Hadoop. And that's how we were doing a lot of our scanning before, which was this kind of batch-oriented scanning: we would take like four million domains, scan all of them, index that information, and then go from there. Anyway, MassWeb is a lot fancier, but it still was meant to do that. So if you're looking to use Hadoop for any reason, or MassWeb for any security reason, and you're looking to fuzz things, for example: I know a lot of crawlers and stuff like that will crawl an application that's just been built to see if there are any 404s that come out. So it might be a good idea to use MassWeb to fuzz some stuff, see if you get back any vulnerabilities, and use that as part of your secure development lifecycle. But nobody's actually going to fucking do that, so whatever. So Ferret is where we are now. It's Java-based. How many of you are Java folk? Any of you guys? Okay, there's a few hands, so you're kind of like a cult, you know, where things are documented in specific places, like it's mostly in the code. Pretty much everything else that you get is class documentation, which is just a shit ton to read. It's like reading man pages. Who reads man pages anymore? I probably just blasphemed a lot in terms of the security community, so whatever. So like I said, Ferret's an asynchronous distributed web app scanner. Async is kind of the way to go for scanning web apps; it's much faster than threading. If I really looked into it, I could tell you why, but I actually don't know why. Maybe some of you have some ideas on why asynchronous crawling works better than threaded crawling; I would love to know that. But when you do asynchronous stuff, I think it's that the CPU usage can be higher and it's just more efficient. So all you really need is more CPU for crawling and scanning; bandwidth is actually not the biggest thing, as much as you would think it would be. So anyway, async is the way to go, that's all. Ferret also has a special place in my heart because it's named after a ferret that I had. The ferret's name wasn't Ferret; the ferret's name was something else, but I named it Ferret based off of that ferret. So I hope that all of you followed that.
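For a feel of why async fetching scales, here is a small sketch assuming asyncio and aiohttp; the URLs are placeholders, not anything punk.sh actually scans. The point is that one thread keeps many requests in flight at once, so CPU and parse time become the limit rather than per-thread overhead.

```python
# Small sketch of async crawling: one thread keeps many requests in flight at
# once, so CPU/parse time becomes the limit rather than per-thread overhead.
# aiohttp is assumed; the URLs are placeholders, not anything punk.sh scans.
import asyncio
import aiohttp

async def fetch(session, url):
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
        body = await resp.text()
        return url, resp.status, len(body)

async def crawl(urls, concurrency=100):
    sem = asyncio.Semaphore(concurrency)   # cap concurrent connections
    async with aiohttp.ClientSession() as session:
        async def bounded(url):
            async with sem:
                return await fetch(session, url)
        return await asyncio.gather(*(bounded(u) for u in urls), return_exceptions=True)

results = asyncio.run(crawl([f"http://example.com/?page={i}" for i in range(500)]))
```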
Anyway, so why is this problem hard, right? So all we're doing is, I mean, like fucking Shodan, right? It has every IPv4 address, so why am I not just doing everything the way that Shodan does it? Why is this harder than that? So it turns out that web app scanning, first of all, requires a lot of computing power; it's really, really expensive. If you think about what you're doing with something like Shodan, all you're doing is connecting and maybe banner grabbing. With this, you're actually performing entire TCP connections, actually doing multiple TCP connections all at the same time. You are requesting pages that you have to download and analyze. And then you're also doing the fuzzing piece of it, which means that you could be hitting the same URL a thousand times or whatever, however hard you're fuzzing. Probably not a thousand, that would be ridiculous; it would be more like ten. Anyway, so distributed scans are complicated, right? You have to coordinate shit to work together all at the same time while also having these computing power constraints. It can just get complicated, and it can be hard to debug. Anyway, enough about that. Yeah, I think infrastructure is kind of the key, and reliable proxies are too. So one of the big problems, I mean, I joke about it a lot, but one of the big problems we've had is that nobody wants us to scan on their subnet, frankly, and that's what it comes down to, and I don't think that will ever change. So we keep kind of having to fuck around and move around, which can get really annoying. Why did we build it? Initially, we were just curious if it could be done. I was kind of shocked at the number of results I was getting. It was something like 5% of web applications have extremely dangerous, easily exploitable bugs in them, right? So we were getting a ton of results, and we still continue to get a ton of results. We were kind of frustrated at the security industry and the state of it in general. And we
kind of wanted to raise awareness and make it easier to see what the scale of the problem is, how big this problem is. So we did stuff with a lot of big numbers, in other words. We also believe that the information should be available for security researchers, education institutions, journalists. The general public, we think that it should just be made freely available, right? So we're releasing a shit ton of vulnerabilities and not necessarily doing what they call responsible disclosure, right? Which we can talk about that for hours or whatever, but it's not something that I conform to at the moment. All right, so that's enough about that. I wanted to show you punk.sh, but I still need my notes because somewhere I had what
I wanted to show you guys. Well, anyway. A couple things I know I did want to show you: if you're a noob to punk.sh, which most of you are, because, I'm sorry, that sounded mean, I didn't mean for that to sound mean, I just mean nobody knows about this yet. But, yeah, anyway, some stuff we can do is search over here on the left-hand side, right? We can search for types of vulnerabilities that we found, the different ports. This has a bug in it; we found a lot more than four results. Yeah. Oh, yeah, there is another filter applied; it's everything with port 23 and those vulnerabilities. So, yeah, you're right. Let me do this. Error occurred fetching results. Oh, that's a good question. Yeah, I guess, that sucks, anyway, I'll have to fix that here in a second. But one of the best things you can do is just look down the left-hand side and see what filters it already provides you. So for example, ports: you want to see the IPs that have a certain port open, I could have port 23 over here like I had before and see which ones have port 23 open, which is usually the Telnet service, or port 22, the SSH service. And by the way, the scans are still being conducted, so the numbers you see here aren't going to be like, you know, hundreds of millions or anything like that just yet. You know, there are some other interesting ports open. You see the finger port is open. I forget exactly, who knows which port the finger port is? 60? Are you saying 69? Seriously? Okay. Yeah, it's somewhere in my notes, but I don't feel like going all the way back in my notes. Anyway, the finger port is open, 79, there we go. So let's see if that even works. So yeah, no, we're not connected to the internet, and that is the issue going on right now. I could turn on my hotspot or actually connect to the internet; which do you all think is faster? Do you know it by heart? Collider guest. All lower case? Is it all lower case, really? Okay. It's okay, it's okay, they can trust me. Yeah, you should definitely edit this one. No, just edit everything, like just everything I say. You can bleep shit. You can... Yeah, right. Good point. Yeah, there's maybe worse stuff on the internet than me just saying fuck a lot. Okay. We were looking at port 79, which is the finger port, right? So this, well, it says that there's two, but there's actually a lot more. We're still working out some of the kinks in this, so, you know, just don't worry about it. But we can look at the NSE results, and you can look at just the specific results
You can look at the specific results of each NSE script that it ran. And remember, this is going to be run on four billion different machines, all of these NSE scripts. So all this stuff has been done; it's been checked for the various CVEs. I think I probably just passed over the finger information, but anyway, the left-hand side is a great way to navigate all of this.
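For a sense of what "running NSE scripts" against a single host looks like, here is a minimal sketch driven from Python. This is not the punk.sh pipeline; it is just plain nmap with its vulnerability-category scripts, pointed at scanme.nmap.org, which permits scanning.

    import subprocess

    def run_nse_vuln_scan(target):
        # Service detection plus the 'vuln' NSE script category; '-oX -' writes XML to stdout,
        # the kind of structured output you could parse and index afterwards.
        cmd = ["nmap", "-sV", "--script", "vuln", "-oX", "-", target]
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return result.stdout

    if __name__ == "__main__":
        print(run_nse_vuln_scan("scanme.nmap.org")[:500])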
Something else I wanted to show you: you saw already that there are CVEs in there, right? So we can just type CVE-whatever and it'll do autocomplete for you. I'm trying to remember which CVEs I wanted to show you. Oh, I think it was, yeah, 2011. Who can remember numbers well? 3192, right? So, dash... oh, yeah, okay, that's how we're going to play it. Yeah, that's probably the second one.
All right, so 1,571 results for this, right? Now let's look at what that actually is. So it is the byte range filter, blah, blah, blah, allows remote attackers to cause a denial of service. Essentially, this is just another DoS attack against Apache. You can bring down pretty much anything that is vulnerable to this, which would be all of the versions that it mentions. So not great, right? That's not the best thing in the world to have out there, and to have so many of them out there as well. And obviously our scanners are still running; this is still in the early works, so we haven't scanned those 4 billion domains yet. That's kind of our larger end goal. Another CVE that I wanted to show you was, it looks like, 2015. What am I... where am I? Why are things happening that I'm not telling you to do? I'm supposed to know how to use a computer, right? Okay, something's gone terribly wrong, and I'm going to open a new tab and go to punk.sh. You've got to be kidding me. This is just getting ridiculous. Oh, is that right? Oh no, everybody stop doing things. 429, too many requests. Wait, so now if I reload this page, it's not going to work, is it? Anyway, there was another CVE that I wanted to show you, and there are a lot of results for it as well. And this one
is essentially, you see here, the HTTP.sys remote code execution vulnerability. We have thousands of results for that as well. You can look it up, but not right now, not right now. So... yeah. I'm going to try to reload. Why? Why is this happening to me? It's a demo, yeah, that's why. Man. I'm sorry, I'm enabling my hotspot; I'm not, like, calling my mother or something like that. I think I'll just watch The Devil's Advocate or something like that, you know, the one with Al Pacino. Great movie. All right, so we have punk.sh back online. And what else did I want to show you? I guess that was it. Here we go. Okay, web results. Now it's actually working.
So these web results, what are we scanning for, right? Everything else you saw was Nmap. The other stuff that we're actually scanning for is blind SQL injection, OS command injection (really shittily, by the way, but the rest are good, so forget it), SQL injection, path traversal, XPath injection, and cross-site scripting. That's the stuff we're searching for. It's nothing extremely complicated to check; it's just injection vulnerabilities, right? So that's what we are doing.
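As a toy illustration of the kind of injection checks being described, and emphatically not Hyperion Gray's scanner, something like the following catches the easy cases: reflect a marker for cross-site scripting, then append a quote and look for database error strings for error-based SQL injection. The URLs and parameter names are made up, the requests package is assumed to be installed, and you should only run this against applications you are authorized to test.

    import requests

    SQL_ERRORS = ("you have an error in your sql syntax", "unclosed quotation mark",
                  "sqlite3.operationalerror", "pg::syntaxerror")

    def check_reflected_xss(url, param):
        # Inject a marker and see whether it comes back unescaped in the response body.
        marker = "<punktest'\"><svg onload=x>"
        r = requests.get(url, params={param: marker}, timeout=10)
        return marker in r.text

    def check_error_based_sqli(url, param):
        # Append a single quote and look for well-known database error strings.
        r = requests.get(url, params={param: "1'"}, timeout=10)
        body = r.text.lower()
        return any(err in body for err in SQL_ERRORS)

    # Hypothetical usage:
    # print(check_reflected_xss("http://example.com/search", "q"))
    # print(check_error_based_sqli("http://example.com/item", "id"))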
And this is ferret again. Let's see, what else? What other information did I want to give you? Oh yeah, it's separated by country, not for any particular reason, and I won't click on any of them, or maybe we'll start a political discussion that I'm probably not prepared to have. So, yeah, you can search by country. And one day we'll have a sweet map just like Shodan, so you guys can visit it. Sorry, again, that was really mean. You can also search by products, right? For example, there are going to be a lot of Apache products over here: a lot of versions of Apache Web Server, Apache Coyote, and you can just see what people are running. So there are lots of ways to slice and dice this data, lots of ways to look at it. And you can access it now, just not on my network. But yeah, let's see.
What else did I want to show you? There may be something, there may be not. Sensitivity of NSE scripts... you guys are reading my notes. Countries. Mention that we might have a cool-ass map, because that's something that's really important to mention. Product version. Oh yeah, Drax. We can show some open services. Does anybody know what Drax are? Okay, yes. For... what is it? Oh, really? Yeah, we could. I don't know if there's an NSE script that is searching for that currently, though. But services, right? So VNC is another one that I wanted to show you, and yeah, there are a couple of baffling ones. But I'll show you a couple, actually. Number one, MongoDB. Why would somebody expose that to the internet? That's just a really bad thing to have exposed to the internet. It usually doesn't have authentication on it by default, which is great. NoSQL, right? And somebody called it, like, oh, NoSQL, no authentication, and I thought it was a funny joke because I just have a bad sense of humor. But yeah, so MongoDB is there.
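To show why an internet-exposed MongoDB with no authentication is such a bad idea, here is a tiny sketch using pymongo (assumed to be installed): if the default configuration is reachable, anyone can list the databases. The address below is a documentation IP, purely hypothetical; only point this at servers you control.

    from pymongo import MongoClient
    from pymongo.errors import PyMongoError

    def mongo_is_open(host, port=27017):
        # Returns the database names if the server answers without credentials, else None.
        try:
            client = MongoClient(host, port, serverSelectionTimeoutMS=3000)
            return client.list_database_names()   # only succeeds when no auth is required
        except PyMongoError:
            return None

    # Hypothetical usage:
    # print(mongo_is_open("203.0.113.10"))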
Also wanted to show you... yeah, VNC was the last thing I wanted to show you. So VNC over here, and we can see everything that has VNC ports open. If you're not familiar with VNC, it is a remote administration protocol that does everything in plain text, transmitting all your information, even the username and password. It's just plain text.
Yeah. Which one? Oh yeah, you'll find all kinds of stuff. Yes, that is a law firm. Yeah, Angela Choi is law. It's not Angela Cho is law, but it's close. I'm sorry. And we can see everything that we want about it, right? So here's its IP address, and here are the services that are open on this server. You'll see that it's... I can't even... I don't even know how you open that many services. Their sysadmin must be, I don't know, like a five-year-old. No offense to five-year-olds. Oh man, that is so many ports open. It could be a honeypot, first of all. It could also be that there is a firewall in the way that is sending back packets that are just replies, just because that's what it does. But for a small law firm, I'm not sure that would be the case. Is somebody chanting honeypot? So, there are lots of different results that you get over here. Another one is HTTP methods. A lot of times there are unsafe HTTP methods, like TRACE, that are enabled, and other stuff that will get you debug information on an application and its server. So yeah, lots of ways to look at this information. Very glad that I was able to get internet, because that would have sucked. But anyway... oh, last thing. Yeah, the last one I can show you now, with internet.
No. Sorry, somebody said, I certainly hope those are patched. Why? We also have a question from the internet: why do the product menus stop at 8? Why do product menus stop at 8? Wait, it's still going. It's here. Or were they? Well, that answers the question again. Yeah, that's a really tough question to answer. By looking at things? Yeah. Oh yeah, you can use the autocomplete. So if I were to start typing Apache or something like that... but that's still also not limited to eight; you can scroll a lot. Oh, stop it. H. Oh, that's a bug. That's probably a bug. Yeah, we'll look into that. The last thing I wanted to show you was Heartbleed. So the second one down, I think, is probably the best one. Vulnerable, right? So it tells you the state of it: it's vulnerable. We're all probably vaguely familiar with Heartbleed, but it was a pretty much ubiquitous vulnerability in OpenSSL for a long time. It allowed you to steal a bunch of memory from stuff, and most of the time that information was sensitive; a lot of the time it was passwords and things that had been passed, even over connections that were properly encrypted and everything. So, you know, that sucks.
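If you want to check a host you own for the same thing, one common route is the ssl-heartbleed script that ships with nmap; the punk.sh checks were described as NSE-based, so this is in the same spirit, though it is only a sketch and not their pipeline.

    import subprocess

    def heartbleed_check(target, port=443):
        # Run nmap's ssl-heartbleed NSE script against a single host and port.
        cmd = ["nmap", "-p", str(port), "--script", "ssl-heartbleed", target]
        out = subprocess.run(cmd, capture_output=True, text=True).stdout
        # The script reports "State: VULNERABLE" when the heartbeat leak triggers.
        return "VULNERABLE" in out

    # Hypothetical usage, only against hosts you are authorized to test:
    # print(heartbleed_check("example.com"))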
And the last thing I wanted to tell you... no, that wasn't it. Yeah, you can follow me @hyperion. That's H-Y-P-3-R-I-0-N, because I'm a dork. @hyperiongray is my company. We also have a company Instagram; I think there are like five pictures on it, but it's still really awesome. Those five pictures are great. We have a blog that we post to, again, a lot; it's about exploit writing now. There's a course site, [unclear] dot com, where there's a free three-hour web-app-breaking course, like how do you break a web app. You can go there totally free; there are no ads or anything like that. It's just hosted there: the videos are up there, the slides are up there, everything you need to take that course. And www.hyperiongray.com, obviously go there and look at it. You can look at that one really hard; just, you know, Punk Spider, like, take it easy, you know? But yeah, that's everything, I believe. So I guess now is question time. Oh, I was like, no questions. I saw you there. Yeah. So currently we don't, but I think that's definitely the next feature, because there are so many great ways to use this data, and if you can't do that programmatically, that's a huge problem. Actually, no, I shouldn't say that. Yes, the answer is yes, we actually do that already, but it's undocumented. So you kind of have to reverse engineer what AJAX call is being sent and then use Burp Suite or something like that, or Firebug, or whatever. But essentially, yes, API access does exist.
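For the "reverse engineer the AJAX call" route, the shape of it is roughly this: capture the XHR the search page sends (Burp, Firebug, browser dev tools) and replay it. The endpoint path and parameter names below are entirely made up for illustration, since the real punk.sh calls are undocumented; substitute whatever your proxy actually captured. Assumes the requests package.

    import requests

    def punk_search(query):
        # Hypothetical endpoint and parameters -- replace with the call you captured in your proxy.
        url = "https://punk.sh/api/search"
        params = {"q": query, "page": 1}
        resp = requests.get(url, params=params, timeout=10)
        resp.raise_for_status()
        return resp.json()

    # print(punk_search("port:79"))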
It would be awesome if somebody wanted to write that documentation, just saying; we really don't feel like doing that. By the way, when I say we, it's me and this guy named Tomas, who is in Hyperion Gray as well, and he's awesome. So just a shout out to Tomas. Anyway, yeah. Using this programmatically is a great use case. There's a lot of data in here, and there's going to be a lot more soon. We work on this in kind of stages, because we work, work, work, work, work, and then run out of money. Or work, work, work, work, work, get banned from wherever we are, and have to move absolutely everything.
So it's very much worked on in stages, and right now it's in a little bit of a quiet stage. So anyway. Yeah. Any of the statistics people that are here? There were two. I know there were two here. Yeah, at least two. Yeah, it would be, yeah. Like a report or something on X number of vulnerabilities, or the percentage of vulnerabilities by X, Y, Z, all that stuff. That would be awesome. Okay, there was one... there's another question. Yes, sir. Yeah. She is, absolutely. Yeah. So what's your back end like? How long does it take you to get bounced? Do you expect to run this against the whole universe once just because you could? And if so, how long do you think that will take? Yeah, so I actually did run it against the whole universe once, when it was called Punk Spider. Those results were from a whole wide internet scan, so that's all those four billion domains. I did a much more surface scan than I normally would, so I was only crawling like 10 deep, 10 pages deep or something like that. But we did and do have that information still. So if that's something that you want to use, we have that vulnerability information; we provide it to any researchers that want it, or just email us and ask for it. So far that's been zero, right? Is that zero? Yeah. No, it's close. Okay, yeah.
So it's like, I don't know, a handful. But we do provide that data, so if you want to do analysis on it, that would be great. And the amount of time that it took was actually four days, because I went on AWS, spun up the biggest-ass servers I could, and accepted that the bill was going to be 800 bucks. Which it was; it was like $830 or something like that. Built this massive Hadoop cluster and just scanned the shit out of the internet, and that was the information in Punk Spider. With punk.sh we're trying to really do a push; that earlier run was only with MassWeb, which is the web app scanner. So with punk.sh we're looking to push to get more types of information: NSE scripts from Nmap, ports that are open, services, doing service detection, all that kind of stuff. We want to provide that to you, right? So, yeah, I guess that's a really long answer to a simple question, but the answer is about four days, it cost about $830, we used a bunch of big-ass servers, and it was awesome.
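For what it's worth, the numbers just quoted imply a throughput and unit cost along these lines; this is back-of-the-envelope arithmetic from what was said, nothing more.

    # Rough arithmetic from the figures mentioned: ~4 billion targets, ~4 days, ~$830.
    targets = 4_000_000_000
    days = 4
    cost_usd = 830

    per_second = targets / (days * 24 * 3600)
    print(f"~{per_second:,.0f} targets per second across the cluster")      # ~11,574/sec
    print(f"~${cost_usd / (targets / 1_000_000):.4f} per million targets")  # ~$0.21 per million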
Right. They're very unpredictable. What's that? Oh, yeah. We're doing something like, it was close to 200 NSE scripts on everything. So someday we may have to decrease that. We may have to, I don't know, we should probably ask for donations at some point because we just keep running out of funding for it. Or if somebody wants to fund it, like, by all means. Yes, sir.
Yeah, so the question was, is there an order to the sites that are being scanned? Is that correct? Not really, no. Well, I should say, no, yeah, no. But we did do a scan of Alexa's top 1 million before we did anything. So it currently has Alexa's top 1 million, plus a bunch of other websites that we're just crawling from, like I said, Common Crawl.
Nmap might be able to; I don't believe we do, not specifically, I should say. We're not scanning IPv6 ranges or anything like that, which would be great; I think it's about time somebody started doing that. But, you know, adoption of IPv6 hasn't gone as smoothly as everybody would like, I think, and yeah, in this country especially, right? So I think it's about time somebody started doing IPv6, and maybe we can do that. But yeah, good question. So, very quickly, probably within one second or so: once it finishes scanning, it just sends the information to Kafka, it's pulled from Kafka, and then it's indexed into CloudSearch and RDS at the same time. And all of that takes, yeah, maybe a second, probably a second, because of networking stuff. So yeah, all right.
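The pipeline shape he describes, scanner publishes to Kafka and consumers pull and index, looks roughly like this minimal sketch using the kafka-python package. The broker address, topic name, and message fields are assumptions for illustration, not the actual punk.sh configuration.

    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",                      # assumed broker address
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    def publish_scan_result(result):
        # Scanner side: push one finished result; separate consumers index it into the data stores.
        producer.send("scan-results", result)                     # assumed topic name
        producer.flush()

    publish_scan_result({"ip": "203.0.113.10", "port": 79, "service": "finger"})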
- We have one more question from the internet. It's asking you to explain more about what you expect to do with the results. - Well, this talk is called Break Everything, right? But it's not that I want to break everything; I actually kind of want to defend a lot of things. Most of these sites, the sites that I look at and really think about, are kind of the mom-and-pop store shops that just don't even know to care about security, right? So we made this
as simple as possible originally. It was meant for just about anybody to use; you can just pop in. I've shown you all this complicated stuff, which is more meant for pen testers to use, people that want to do reconnaissance, and, yeah, that kind of stuff, you know, and to look in mass if a company is there. Obviously, once we have more results, that's going to be more helpful. Originally, it was more like, like I was mentioning, the mom-and-pop stores. I wanted them to be able to search their URL, just right up here, and see if they had any vulnerabilities. And if they did, then have whoever's developing the website fix it, right? That was kind of the original intention of it. That morphed over to punk.sh, because really we found that adoption of that was not as good as we wanted. We definitely did get a lot of inquiries from, like I said, mom-and-pop kind of shops, and we helped them work through some vulnerabilities. But in the end, we just found that the security community is really the one that cares about security stuff, which maybe should have been obvious, but whatever. Yeah, we built a browser plugin. So it would actually plug in, and it would search a site for you and see if it was on punk.sh. And that way you knew, if it was, like if it had SQL injection, then, yo, don't give your credit card to that site, right?
So yeah, that was an interesting use case as well. So yeah, any other questions? Oh, what's up? What's the Feeling Hacky? Oh, Feeling Hacky. Yeah, actually, you know, Tomas put that in there, and it's kind of like I'm Feeling Lucky in Google. Let me see something there. There we go. I think it just gives you the top result, like,
Nothing too exciting, I'm sorry. It sounds really exciting, you know, but no, it's really just a silly little thing we added in there because, you know, Google. Yeah, any other? Oh, they're giving me the kill switch sign. Yeah, no, thank you for listening. It's great.
Okay, speaker gift. This is the first year we've ever done coins. Since you're a geek, I know you can read binary, so I'm not even going to tell you what it says. It's got Latin on the back, so if you can't speak Latin, ask your wife; I'm sure she might. We also had these made this year. These are made locally, the glass and everything, with the B-Sides logo. Up next, testing endpoint security solutions with Atomic Red Team, with Adam Mathis from Red Canary, at 3:10 PM. - And you're live. - All right. Hey, how's everybody doing today? Doing good? Good? Thumbs up across the board, that's good. It's usually nap time for a lot of people, so glad everybody's still awake. So, thank you guys for coming out. Today we're going to be talking about testing security solutions with Atomic Red Team. It's an open source project we started about a year ago, you know, a little bit less than that, I guess, but we'll get into that in a little bit. So who am I? Not super important. There are a few of my favorite things. I have been doing security things for a decade plus, really focused on a security role for the last eight-ish years, doing a lot of different stuff: vulnerability management and assessments, penetration testing, security engineering, architecture, all that good stuff. Right now I work for Red Canary. I'm an incident handler with them, so we do consulting, we do, obviously, incident handling.
But they were nice enough to let me steal their slide deck, so you'll see the bird in the corner. Alright, so what are we going to talk about today? So first of all, we're going to talk about testing, why it's important, and some of the common roadblocks to organizations being able to test effectively. And then once I talk about how bad the testing situation is today, we'll get into some more fun stuff, a proposal for how to do it better, give you some tools so you can go home and start doing this immediately. So let's look at the current state of the SOC. Obviously this is kind of a generalization, so your organization might be much better or much worse, but this is kind of, having worked with a
lot of different organizations of different maturities and different technology stacks, this is what we see consistently. So you have things. You have your security tools: your SIEM, your IDS, your IPS, your AV, your next-gen AV, your next-next-next-gen AV. You have all this stuff. Then you have your controls, so these are your hardened systems; these are the things you put in place so that users can't be users. And then your policies: things on paper that people are not supposed to do, and mostly they do anyway. So you have those things, and you expect them to do things. You expect them to prevent evil and to assemble
your incident response team like Voltron. And if APT gets past your impenetrable fortress, you want it to be evicted, nuked, you want their families to suffer. So that's what you expect from your stuff. But truthfully, a lot of organizations have significant investments in their security programs and don't actually know if they work. Let's talk about how we got here. How do we build these big programs that we're not sure about? We build things because it's good security. It's best practices. You saw there was a gap, you went out and you filled it. You made sure that you were taking care of all the different places people could get in. If there was a piece of tooling you needed to get some additional visibility, you went out and
tried to get it. Sometimes it's compliance: some governing body says, hey, you have to have this thing, even though you don't think it's particularly effective. You have to have it, and so you'd like to know that it's working the way it's supposed to. And expectations. I'm sure a lot of you have had a manager or some C-level that went to RSA and came back and said, we have to have browser isolation, or next-gen all the things. And you're like, oh, gosh, okay. So... because Gartner told them to, you're expected to have these things. I've talked to organizations where it's like one guy, he's the director of everything, and he's like, "Yeah, I'm really thinking that I'm going to go for a SIEM next." Really? Because you're doing operations and incident response and security operations, you're doing all the things; you don't have time for any of that. So how are you going to build, maintain, and actually get value out of a SIEM? But that's what he's supposed to get; that's the next checkbox for him. People accumulate things they can't actually use. It happens, man. I've been there. Now that we have all of these things, we have the best prevention, the best detection, the best visibility, the best workflows. All of our teams are really fine-tuned machines working at maximum efficiency and capacity. Right?
Everybody feels that way about the stuff they have? So let's see how you feel. If you think about the organization that you work with, that you work for, and the security that you have, do a true assessment of its efficacy. You've got a lot of security; are you confident that it's doing the things it's supposed to do? Maybe you believe everything is working the way that it's intended to work. You hope everything is working the way it's intended to work, but as everybody knows, hope is a feeling, not a strategy. You cannot just hope that all of your security is working, because you will get breached. So how do you know it's working? You test it. If you want to know if something's working, if you want to know if a car is going to start, you turn the key, right? So you test it and see if it's going to work. You have to know that your stuff works. If you are not testing your controls, rest assured someone else is. You may or may not know it. They're doing it for free, and they're going to give you a bad day one day. So how do we test? We know testing is important, so how do we go about doing it? There are a couple of different ways that testing happens today. If you're going out and you're looking at some new piece of technology, and you ask the vendor, "How do I know that you're doing the thing
you're selling me?" Of course, they're going to say, "Hey, no worries. We'll help you test this." Mostly, their tests are going to play to their strengths, and sometimes they're outright rigged. If you ever hear of a vendor who doesn't let you test their software in an open forum, like if only approved vendors can do tests on their product, run from that. I worked for a company where we were testing an email malware solution, and the SE came in and he was like, "Yeah, this thing's so awesome, so awesome." I was like, "Okay, cool, we need to test this." He's like, "I got you. Give me an email address, we're going to send you malware to that email address, and you can see how good the tool is." Okay, buddy. I'm glad that your tool can catch things one for one from your curated list of malware. That's great, but that doesn't necessarily mean it's going to catch everything. So that's not necessarily a good way to do it. You could try to build your own test plans, but it's kind of hard to get started. How do you know what the right things to test are? How do I take a piece of security tooling, or take a control, and test it and see if it's working the way it's supposed to? Is it actually doing
the thing? And then how do you scope and track your progress? If you wanted to test AV, are you just going to go download all the malware? Where do you stop? How do you know how many samples to test before you know you've tested it all? So a common approach that people use for testing is to hire an external red team. There are problems with this. Usually it's a compliance checkbox for a lot of organizations: somebody at the top knows that they have to have a piece of paper that says this is a penetration test so that they pass their audits. That's usually not a great way to do it, because the engagement is usually all about, "Okay, hey, just come in, do some stuff, give me a report, and then you're good." A lot of times the boots-on-the-ground guys don't even get to talk to you, so you can't give them some requirements like, "Hey, test this, test this." They just kind of end up with what they get at the end. Not all teams are created equal. John Strand has a term, "pen test puppy mills": organizations that will come in, scan your environment with Nessus, put it in a very nice Word document, and that is it. I'm sure some of you have probably seen that, and if you haven't, I hope you don't. I mean, there are some great teams out there, some great guys; obviously, you know, Lares is here, and the Mandiants, the SpecterOps, the Black Hills, all those. There are a lot of really good organizations out there, but they are hard to get. You usually have to schedule them far in advance, it's hard to get them to come in on a regular basis, and also it's expensive. You probably can't justify the cost of bringing in a top-tier red team over and over again like you would need to in order to get good, consistent, iterative testing. Sometimes there are scoping problems, especially in those compliance-checkbox situations, because basically they say, "Come in and hack us." Somebody who's not really in the know: "Come in here, just assess all my things." And so they come in, they go for domain admin, they probably get it, but they're just going to take the shortest route to get there. They're not necessarily going to test all of your controls. And there are long turnaround times. You may have a test in year one, then they give you some findings, and you go out there and make sure that's all covered. And then maybe you have a recertification test; maybe they come back in a couple of months and make sure it's actually secure. Usually you have to wait till the next year. If you're one of those organizations fortunate enough to have quarterly pen tests, then that's a little bit quicker, but still not fast enough. So what's the solution? Do we build our own red team that can do this all the time? That's a
great idea. I mean, so penetr