
Thank you all for coming today. I wanted to tell you a little bit about something that I call lean threat intelligence. First of all, I have to say that this is something I came up with together with a user of Graylog, and I'm going to tell you more about that later: Zach Allen of the Fastly security team in San Francisco, the CDN over there. He originally coined the term, and we worked on this together and evolved the idea a little. So this is not only my idea; it comes from working with computers and the internet and security and programming and architecting, stuff like that. After a few years of doing that, it all came together a little, and I felt comfortable presenting it in the context of a security conference now, because my background is not necessarily cyber security or incident response or anything around that. My background is more that I was a software engineer and a software architect, so I come from a DevOps world. I started writing PHP ages ago in Germany, then did some Ruby, and I started working on Graylog, and more and more people started using it for security use cases.
So then I had to get into the whole security space. I guess many of you will know more about certain security use cases than I do, but I've been talking to a lot of users and customers and people in the security space, and I've seen a lot of things. I'm trying to bring this all together, which is also why I'm planning a pretty long Q&A session today, because maybe there's even a way to get a discussion about this going.

What struck me was that when I started as a software engineer, this whole idea of DevOps, you've probably heard about it, was really just starting to come along. There are developers and there are operations people, and it feels like they're working against each other all the time: developers write the software, operations people have to run it in production, and developers complain that operations is not fast enough, that they can't get their hardware fast enough, all these processes for controlling how software actually runs. And the developers write this stuff but don't think about how to deploy it, configure it, and run it in production. So this DevOps idea came together, which is kind of both worlds: you write the software, you deploy it yourself with help from IT operations, but you're more involved, so you understand each other's problems.

When I came deeper into the security space, I felt like there were similar problems. There are security analysts who look at data that maybe other people collect. I was in the office of a user once, and I saw people lined up at the desk of the poor guy over there who was running the log management system, which I think was Splunk back then, asking: hey, can you put this new rule in? Hey, I need to parse this. Hey, can you install this app for me?
And I felt very reminded of the problems I saw as a developer having to work with IT operations. That was when I started thinking: there must be a better way. Maybe we can have a database layer for all security data that is a little more self-service, where if you need something from that database layer and you have access to it, you can just go in. You can subscribe to data in many different ways, there are going to be APIs, and there's a lot of runtime configuration you can do in the web interface of the system without requiring restarts. That obviously comes with problems of its own: you have to have the permissions, and a change you make can't affect other parts of the system. So this is when we started thinking about this more.

I usually start a talk by asking: who of you is doing any kind of log management right now, meaning you're centralizing your logs somewhere in your organization? Who of you is using Splunk for that? Who's using something like the ELK stack, something from Elastic, for example? OK. And who has heard about Graylog before, or is maybe even a user in this room? I'm just doing that to figure out how to pitch the talk today.
I will show you a bunch of stuff in Graylog. It's very important to say that you can do this with Splunk, you can do it with the ELK stack, you can do it with other tools too. I'm just going to show it to you in Graylog because this is obviously the tool I know the most about, since I started it back in the day and then also started a company behind it. But I don't want to make this a vendor talk. I want to talk about the idea of lean threat intelligence, with Graylog examples, but I'm sure you can build this with other tools too.
I feel like the traditional solution for this is a SIEM system that detects threats mostly based on real-time data, usually streaming data. It looks at shorter time periods and tries to build correlations. It does a lot of normalization. But it doesn't feel very flexible; it's very monolithic, and it usually comes with a standard set of logic-based rules from the vendor. There are better and not-so-good examples among these tools, but it can get pretty hard to write your own rules: you get these weird UIs or weird config languages behind them. And I've seen a lot of people just stick with the rules that ship with the product, let it run, and hope. Depending on how good your security organization is, some people just tick a checkbox and say: yep, see, we've got it. These systems also tend to be pretty expensive. And, my biggest problem, and the reason I started Graylog back in the day: they tend to come with a really weird software license that can get in the way. I don't know if you're using a SIEM or a log management product like Splunk that restricts your use based on how much data you write into it per day, but I think we've all seen projects at organizations where a manager goes around telling everyone to log less because we're approaching the daily limit again, which is obviously the exact opposite of what you want to do. That can be a big problem in practice too. And if you want to log more for a month, say you're in any kind of e-commerce setting and Thanksgiving is coming up, it can be pretty hard to get a license increase just for a month, because then your manager has to call another manager who calls the Splunk sales guy, and it takes forever. So we don't want that either.
But that's the traditional solution we see a lot out there. For this lean threat intelligence idea, there are a few requirements. When I say lean threat intelligence, I mean this mixture of a process philosophy and the idea of having a database layer that is very self-service and provides all the security-related data in your organization.

What you need is something that can process large amounts of data from a variety of sources. I see that this is usually the biggest problem already: how do I get this Cisco firewall log in, or the Cisco router logs that look like syslog but really aren't syslog? All of the tools out there fail when they try to parse them, unless you have the Cisco app, which applies some other parser. Or you have other vendors that will just send you the most bizarre XML-based event logs. How do I parse that? Now I have to normalize it. You need a solution that lets you do that very quickly.

You also need a very good ACL system, an access control list system, that lets you give users different permissions, maybe based on roles, based on the LDAP group they're in. This idea really lives and dies with how good the permissions behind it are, because sooner or later there's going to be some data in it that a certain team is not allowed to see. I come from a background of working in Germany for so long, where there are such strong data protection laws that, for example, when I was working as a software architect, I was not allowed to see certain logs until my last days at that company, because they might include email addresses or IP addresses. Under German data protection law that's not allowed, because I could trace what a user did and who that user sent an email to. I couldn't see the content, but I would know there was some metadata-based communication going on. That's something you will hit sooner or later, especially if you're in a world that has any financial data in there, credit card numbers; you've probably all seen it.

It has to have real-time detection of threats. The classic example is an IDS that looks at the data stream and, the moment something happens, or maybe a few minutes after, with an instant correlation, can tell you: something just happened, something's going on, you want to look at it.
But it also has to have the batch side of analysis. For example, recently, when the FBI and DHS came out with the GRIZZLY STEPPE report, we immediately had a bunch of users who ran searches in Graylog over the last five months of network data to see if they got any hits on those indicators of compromise. Turns out it was mostly Tor nodes and VPNs, so they all had something, but that is the kind of thing you want to do that's not real time. So you need a copy of the data in an enriched, already parsed and structured format that you make available for search.

You want some kind of enrichment of the data, because if you deal with only IP addresses all day long, you have no context, like the WHOIS information for an IP address, which country it's coming from, or which organization owns it at the moment. If you're working in a cloud environment, for example, you want enrichment like: this is the IP address, the system saw that it's an internal address and did an automated lookup against the AWS APIs to figure out that this is an EC2 instance, this is its description, and it's in that security group.
If you can enrich data like this, you get extremely powerful searches and a very easy way of doing incident response and threat hunting too. It has to have alerting, obviously. And, very important, it has to have powerful and very open APIs and integrations. That goes back to Zach from Fastly: they wrote a Python wrapper for the Graylog API, for example, with which they run their own searches and threat hunting and things like that on the command line. Graylog is written in Java, and they don't do Java, or they don't like it and don't want to write it, so they just use the REST API and write their own Python script, or Ruby or PHP or whatever you prefer.
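Roughly, the skeleton of such a wrapper might look like this. This is a sketch against the Graylog 2.x REST API from memory: the endpoint path and response fields are assumptions to verify in the API browser I'll show later, and the host, credentials, and indicator IPs are placeholders.

```python
# Minimal sketch of a Graylog REST search wrapper, in the spirit of the
# Fastly one. Endpoint path and response fields are assumptions from the
# 2.x API; check them against your version's API browser.
import requests

GRAYLOG_URL = "http://graylog.example.org:9000/api"  # hypothetical host

def search(query, range_seconds=300, auth=("admin", "password")):
    """Run a relative-time search and return the matching messages."""
    resp = requests.get(
        GRAYLOG_URL + "/search/universal/relative",
        params={"query": query, "range": range_seconds, "limit": 150},
        auth=auth,
        headers={"Accept": "application/json"},
    )
    resp.raise_for_status()
    return resp.json().get("messages", [])

# Sweep a few indicators of compromise over the last 30 days, like the
# GRIZZLY STEPPE searches mentioned earlier (IPs here are placeholders).
for ioc in ["198.51.100.23", "203.0.113.77"]:
    hits = search("source_address:%s OR destination_address:%s" % (ioc, ioc),
                  range_seconds=30 * 24 * 3600)
    print(ioc, len(hits), "matching messages")
```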
For the architecture: like I said, software licensing cannot be in the way. The moment software licensing is in the way, the decision goes up and up and up in your organization, it takes longer and longer, and the conversations get more and more awkward. I don't think it can ever be in the way. So you want very open licensing, which is why what I'm going to show today, and I'll have a demo right after this, is based purely on open source software. There are integrations with proprietary software too, but I can only highlight how important it is that software licensing is not in the way. I think that's one of the reasons tools like Graylog or ELK, or the Elastic Stack as it's called now, are so popular: people feel liberated once they get it running and they're not living under a daily limit of log data anymore.

It has to offer, like I said, stream and batch processing for real-time and aggregate analysis. It must scale very easily: the nicest system doesn't help you at all if it can't process half a million messages per second, if that's how many you have. But even at 5,000, 10,000, 20,000 messages, if it's a pain to scale that thing, it's just not fun to run, and again, the operations people who have to run it are going to hate you. It has to be able to enrich the data, like I said. Good APIs and permission management. It should provide the data in a programming-language-agnostic way, so you don't have to write against some weird internal rule system a vendor came up with, but can integrate with whatever you feel comfortable with, probably Python.

And it must be able to act on events automatically. That means when something you detected happens: send an email, send a Slack message, page PagerDuty, maybe even trigger a script and do something automatically. For something you're so sure is bad, trigger an iptables rule somewhere on your host. That should be possible.
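As a hedged illustration of that last idea, here is a tiny webhook receiver that an HTTP alert callback could POST to, which then blocks the offending address with iptables. The payload field is an assumption; inspect what your alerting tool actually sends before wiring anything like this up.

```python
# Sketch of "act on events automatically": a webhook receiver that an
# alert callback could POST JSON to, blocking the reported IP via iptables.
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        alert = json.loads(body)
        # Assumed field name; adjust to the real callback payload.
        bad_ip = alert.get("source_address")
        if bad_ip:
            # Only automate this for detections you are very sure about.
            subprocess.call(
                ["iptables", "-A", "INPUT", "-s", bad_ip, "-j", "DROP"])
        self.send_response(204)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), AlertHandler).serve_forever()
```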
It also has to be able to mix open source intelligence, all the open source threat intelligence feeds out there, together with your proprietary intelligence. The open source threat intelligence feeds, like the ones from Spamhaus, Facebook ThreatExchange, AlienVault Open Threat Exchange, and all that: I think they're great indicators, but the moment you start using them you realize that's really all they are, indicators. There are going to be a lot of false positives, and most importantly, they're not going to catch anything that's really targeted at your infrastructure. So you want to combine that with proprietary intel. For example, if you've seen suspicious IP addresses before that struck you as weird, you want to keep an eye on them, so you want to mix in your own lookup tables.

So the goal, and that's what I'm going to show you next, is a self-service database layer for all security operations across the whole organization. The solution diagram is actually pretty simple. You have all of your log data from network hardware, applications, operating systems, cloud APIs, all the stuff you're looking at that might have a security-relevant signal in it. Send that through a load balancer or an aggregator; we see a lot of people with, for example, Logstash or Fluentd, or NXLog, which is a wonderful tool in the Windows world, sitting there combining all these logs.
From there they write them into a message queue or directly into the log manager. I'm going to show you a message queue in front, because that adds another level of flexibility. On this message queue you can have very dynamic subscribers and other backends. That means your data is not stuck in your log manager, Splunk, LogRhythm, whatever you want to use, but is available as a real-time stream through this message queue. So if you want to hook something else into it, for example write enriched data into Hadoop, or write it to file, or store it in Amazon Glacier or S3, you have this very flexible point you can hook into. In this case, the solution we see and that I feel most comfortable with is using Kafka as the message queue. It can be complicated to run; if you're just playing around, maybe leave the Kafka part out of the middle. You have to run ZooKeeper, it's not the easiest system to run and maintain, and it's definitely going to require some reading to get it running. But once you have it, you can build an extremely flexible system.
For example, all of your normal log data, your syslog data from your network hardware and operating systems, Windows event logs, all of that goes directly to Kafka. But you can also send data from Snort, from OSSEC, from other proprietary tools out there that are already doing some kind of analysis, and have them send aggregated alerts into a system like Graylog. So if you have Snort and OSSEC or other IDS logs going into Graylog, you can use that as a high-level starting point for threat hunting or a general analysis of what's going on, and then later correlate that data with, say, your NetFlow logs and see what exactly happened. You want all of this in one big bucket, all together, ready for correlation.

One specific thing about Kafka, if you know how Kafka works: Kafka has a concept called topics, each split into partitions. Like on a classic message queue such as AMQP, a topic is something you can subscribe to. So you would write all of your data onto a topic called, say, raw-logs, with no enrichment at all yet. If you subscribe to that, you get a live stream of all your logs, which you can dump on your terminal, do real-time analysis on with grep and a pipe, all that kind of stuff, and build all these integrations. But what we see people do is write from Kafka into Graylog. Graylog does a bunch of enrichment on the data, like the things I mentioned: look at the IP address and enrich it with information about the AWS entity that sent it, or do an LDAP or some other directory lookup internally to see: OK, this is my internal IP address, its hostname is mysql-database-1, it's in this configuration group, it's in this data center. All these kinds of enrichment are things you can do in Graylog, and then Graylog writes the messages back onto another Kafka topic, call it enriched-messages. Now if someone wants to subscribe to something in Kafka, they can choose: do I want the enriched messages that have all the data on them already, or do I want the raw messages that haven't been parsed yet, basically just the bytes that came off the network from your log sources? With this combination, we even have people who write it back onto Kafka, do another analysis, and write it back into Graylog again. That's the kind of output routing you can build. You're extremely flexible, and people on the outside can hook into it at any time and build on whatever state of the data they want.
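A subscriber really can be that small. Here's a minimal sketch that tails the enriched topic; the topic name, the field names, and the kafka-python client are assumptions for the example, and any Kafka client works the same way.

```python
# Sketch of a dynamic subscriber on the message queue: tail the enriched
# topic that Graylog writes back to (topic and field names hypothetical).
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "enriched-messages",                # hypothetical topic name
    bootstrap_servers=["kafka01:9092"],
    group_id="my-threat-hunting-script",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for record in consumer:
    msg = record.value
    # grep-style real-time analysis on the live stream
    if msg.get("threat_indicated"):
        print(msg.get("source_address"), msg.get("message"))
```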
And I'm actually going to show you a Graylog setup like that, running in the cloud. Just so you understand what data it's seeing: it's collecting AWS flow log data, which looks very similar to what you'd get from a switch or firewall somewhere in your network. It gives you high-level information: this IP address, through this network interface, connected to this IP address, from this source port to this destination port, and it was accepted or rejected, it was TCP, basically very high-level information. And it gives you that together with enriched information. So it does all kinds of threat intelligence lookups to see if this is a known bad actor. It's a good indicator; you shouldn't rely on it, but it's a good indicator, I think. It also does WHOIS lookups and GeoIP translation of every IP address it sees, so you'll see very, very enriched messages in there. It's also getting OSSEC logs from a honeypot we're running in a very isolated AWS environment: a completely open security group, a machine running a bunch of fake services, sending all of its OSSEC alerts into this thing. Because that IP address and that service have been out there for so long now, the service is getting hammered with automated attacks, which generates a lot of data you can correlate later on.
It's also getting the access point logs of my access point at home, through an SSH tunnel, just to have another data source in there. And the first thing: I always start logged out, to show you that you cannot do anything in Graylog without signing in. Everything you'll see is completely open source Graylog, so if you want to play around with this, you can go to graylog.org, just download it and play around. There's an OVA image you can just spin up; it should be fairly easy. And it always comes with security enabled. We have the philosophy that HTTPS/TLS encryption of messages being sent in, encryption at rest, LDAP integration, all of that should really always be in the free version, because we don't want to make a management decision possible where someone says: no, I don't want to pay for the security features, just run it, it's going to be fine, right? It's going to be fine, we don't need encryption. So this is always open source, in there from the beginning.
I'm logging in with an admin user, which is why you're going to see everything in Graylog. With another user's permissions, you would just not see certain links. If you have a user, for example, that's not allowed to start and stop message inputs, that's just not going to be in the interface, and the REST API call behind it will just return unauthorized. So you can build this all very dynamically. Now I sign in, and this is actually running; it says localhost, but that's just an SSH tunnel to the cloud, so this goes through the internet, and I'm praying to the demo gods today that this holds up.

I usually start with the system area, because it shows you how all the pieces hang together, and it shows this first idea that everything should be self-service and never require a restart. When you set Graylog up, you have a configuration file, graylog.conf, a standard text file with the base configuration: this is where my Elasticsearch cluster is, this is my cluster name, this is the IP address I'm going to listen on, some base configuration that requires a restart. From there on, everything is done at runtime.

For example, in the nodes overview you'll see that we have one Graylog server running in this cluster. It's not really a cluster, actually; we've got one Graylog server running here. You see it's currently not getting any messages in, up here at the top. That's simply because we're reading flow log data directly from Amazon, which tends to push 25,000 messages at you in one second and then wait 30 seconds before sending more, so you'll see this jump up from time to time. That's simply how the Amazon APIs work. You also get a memory overview here, and if you click on the details, you'll see all kinds of internal metrics. This has been built in because what we as a company do is sell support contracts for this; it has been built from the beginning to be very easy to run and maintain yourself, because I actually don't want the customers to call me. The cool thing is, because it's open source, you get the same thing, right? So it's very easy to debug, and it gives you a lot of internal information. For example, if you had a slow rule somewhere, you would see the process buffer filling up and messages getting stuck there, and you could look into what exactly is going on.
And to show you the self-service approach: you see we have this inputs area here. Imagine, for example, that you want to collect syslog over TCP from your firewalls. You would just select the input type you want to start. So you say: I want to receive syslog messages over TCP. Launch new input. Decide which Graylog node it's going to run on. Give it a title, a port to listen on, the bind address, all pretty standard. There's a bunch of optional fields here, for example if you want TLS encryption of the data that comes in. And if I were to press save here, that would trigger a REST request to the Graylog server, it would start an input right away, and you would have something listening on that port to point your firewalls at. Boom, you're ready. So this is the self-service approach, and it's also a programmatic approach, because everything you see in the web interface is actually just a REST request to the Graylog server. That means if you want to automate it, if you have a configuration management system, if you have something you want to trigger from a script, you can do that, because it's just API calls. The web interface is just a single-page JavaScript app; it has almost no actual logic in it, I would say. Everything I'm going to show you here is in the end just a REST API call: all the analysis, creating streams, deleting streams, getting pivot values, all simple REST API calls. And every call has a permission attached to it, and every user has a set of permissions, so you could even say: this user is allowed to execute the statistics analysis on one specific subset of data.
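To make that concrete: launching the same syslog input from a script might look roughly like this. The input type string and the JSON shape are my recollection of the 2.x API, so treat them as assumptions and check the exact schema in the API browser; host and credentials are placeholders.

```python
# The "launch new input" action as a plain REST call, e.g. from a
# configuration management system. Type string and payload shape assumed.
import requests

payload = {
    "title": "Firewall syslog",
    "type": "org.graylog2.inputs.syslog.tcp.SyslogTCPInput",
    "global": True,  # or target a specific node ID instead
    "configuration": {"bind_address": "0.0.0.0", "port": 1514},
}
resp = requests.post(
    "http://graylog.example.org:9000/api/system/inputs",  # placeholder host
    json=payload,
    auth=("admin", "password"),
)
resp.raise_for_status()
print("input started:", resp.json())
```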
You can give that permission to a single user if you want. It's all pretty straightforward here. If you go to authentication, for example, you see we have only one user set up. But if you wanted to integrate with LDAP, you'd put all of the LDAP configuration in there. I'll admit it: I personally hate LDAP, and I usually get the configuration wrong about 50 times before I figure out, oh, this weird OU string has to be this way, and there's some other setting I got wrong. But you can actually do a test login right here. This is something you hear a lot from people; it's this tiny little feature. You can set everything up, test the server connection, then do a test login, and it will tell you: yes, this user would have been logged in, with this permission group, for example, and then you save the rest of the settings. So I hope I can save people some pain there, by not having to restart the server or put configuration into a config file. By the way, if you have any questions, you can also just throw them in while I'm showing something, but we'll have a good amount of Q&A time too.
Just quickly, I'll show you the pipelines. This is a feature that came out in the 2.0 version. It's a very dynamic way of processing log messages after they have been separated into streams; I'm going to show you the streams after this, but imagine streams as real-time categorization of messages. You can have a stream called Cisco firewall messages and a stream called AWS flow logs, for example, or a stream with all your OSSEC messages or all your DNS queries. Right, DNS queries, I forgot to explain this: I'm also collecting all DNS queries made by the server that is running Graylog. It's listening directly on the interface and watching for DNS traffic. You can put these collectors wherever you want to get an overview of all the DNS traffic in your network, which means you see it even if malware is using an external DNS server, so it's not in your logs. If you put that at the gateway out to the internet, you get a list of everything that ever went over the wire. That's a different topic, I think, but don't be confused if you see DNS queries in there. I'm just going to show you the OSSEC pipeline as the example.
OSSEC is sending CEF, the Common Event Format, I think, which is good: it means all the fields are parsed automatically by Graylog. But in this pipeline I'm enforcing a unification of the fields, because I have decided that in this setup the source IP address always lives in a field called source_address, for example. Now, OSSEC sends it in a field just called source. That makes correlation very complicated; you have to think about it every time you build a query. So I have a pipeline step here, a rule, which takes every message that comes into the OSSEC stream, checks if the message has a field called source, sets the field source_address to the value of the source field, and then removes the source field. So we have a unification of this, which is very important, especially if you have different teams. It's very helpful to have a kind of style guide that says: if you send something in that has a source IP address in it, please call it source_address, or you're getting query hell, and dashboard hell after it, especially when people change it later.
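Here is roughly what that rule looks like in the pipeline rule language, wrapped in a script that creates it over REST. The rule body matches what I just described; the plugin's endpoint path is an assumption from the 2.x pipeline processor, so verify it in the API browser.

```python
# Create the source -> source_address normalization rule via the REST API.
import requests

RULE_SOURCE = """
rule "normalize OSSEC source field"
when
  has_field("source")
then
  set_field("source_address", to_string($message.source));
  remove_field("source");
end
"""

resp = requests.post(
    "http://graylog.example.org:9000/api"
    "/plugins/org.graylog.plugins.pipelineprocessor/system/pipelines/rule",
    json={"title": "normalize OSSEC source field",
          "description": "Unify source -> source_address",
          "source": RULE_SOURCE},
    auth=("admin", "password"),  # placeholder credentials
)
resp.raise_for_status()
```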
Another thing happening here is threat intelligence lookups. When a message has the field source_address or destination_address, we run a function called threat_intel_lookup_ip on it. What runs in the background is heavily cached, so the first message that comes in might cause a little delay, but from there on it's cached. It does all of the enrichment, which I'm going to show you on the messages in a second. We're also doing WHOIS lookups, which go directly to ARIN. I thought that was never going to work because of quota limits, but for IP lookups, compared to domain lookups, the quotas turn out to be pretty relaxed. If you cache properly, they'll probably not even block you for thousands of lookups a second; at least I haven't been locked out yet, and we have a setup with 25,000 lookups going through just fine.
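To show why the caching matters, here's a hedged sketch of a WHOIS-style lookup against ARIN's public RDAP endpoint with an in-process cache, so a flood of messages from one address costs a single HTTP call. The response parsing is deliberately simplified.

```python
# Cached WHOIS-style lookup: repeated hits on the same IP never leave
# the process after the first call.
from functools import lru_cache
import requests

@lru_cache(maxsize=65536)
def whois_name(ip):
    """Return the registered network name for an IP via ARIN RDAP."""
    resp = requests.get("https://rdap.arin.net/registry/ip/" + ip, timeout=5)
    if resp.status_code != 200:
        return None
    return resp.json().get("name")

print(whois_name("8.8.8.8"))   # one network call
print(whois_name("8.8.8.8"))   # served from the cache
```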
All of this comes out of a plugin, by the way; everything you're seeing here is actually plugin code, and that plugin is open source on GitHub, so if you want to build your own lookups, you can add them there. We're working on turning that into just a kind of rule language, because these threat feeds usually all look kind of the same, right? And I know Palo Alto Networks is working on something called MineMeld, I think, which is an aggregator. We're working on an integration where you can just describe the format: all the comment lines start with a hash, for example, and then there's going to be an IP address, and the script behind it handles the rest. You can also do known Tor exit node lookups, all that kind of stuff. I feel like I've built five of those now, and it's almost all the same code except the parser, so it feels like you'd want to unify that and open it up. And that's what I mean by proprietary intel: if you have your own lists, maybe a CSV lookup table somewhere, or a MySQL table or something, that would be very easy to wire in, as the sketch below shows.
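A minimal sketch of that lookup-table idea, with the file name and column layout invented for the example:

```python
# Proprietary intel as a simple lookup table: load a CSV of IPs you've
# flagged yourself and check messages against it.
import csv

def load_watchlist(path="suspicious_ips.csv"):
    """CSV with one column: the IP address to keep an eye on."""
    with open(path) as f:
        return {row[0].strip() for row in csv.reader(f) if row}

WATCHLIST = load_watchlist()

def check(message):
    src = message.get("source_address")
    if src in WATCHLIST:
        print("watchlisted address seen:", src)
```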
OK, so much for the system part. You see there's a bunch more here, like changing log levels on the fly, and you can manage index sets, but I think that's a little out of scope for today. So I'm just going to jump right in and show you some searches. I haven't run on such a small resolution in a while, and since I actually wrote most of the CSS in there, I hope it's not completely breaking. Whoa.
So this is the general layout. If you've ever used a tool that works with logs, you'll probably find it very familiar already. You have a search query bar at the top where you enter your query; it's the Elasticsearch query language, so if you use an ELK stack and then look at this, it's the same query language. A big difference from Splunk, and you could argue whether it's better or worse or whether there should be a middle way: you cannot do any field detection or field transformation at search time, except for decorators, but that's another topic. You will not see the very powerful pipe-based language that you see in Splunk; you have to transform the messages up front, and you also have to apply the format of the messages up front. So it's not schema-on-read; it's schema-on-write in this case. You select a time frame, write your query; you can save searches. On the left you get some information about your search result: how many messages were found, in this case just every message of the last 5 minutes. And you have this list here on the left of all the parsed-out, structured fields of the messages, which I'll use for a bit of analysis in the next few minutes.
These are basically all the fields that appear in this result set. For example, if I click on a message, you'll see this one is a DNS query for the Latin American NIC, which is a WHOIS lookup that actually happened because, as I said, these are the DNS logs of the Graylog server itself, so it's logging its own WHOIS lookups in this case. You see that this message is broken down into a bunch of fields: the DNS question type, the response code, for example. You see that the requested DNS question, the hostname that was requested for translation, did not indicate a threat, so DNS threat indicated is set to false. If this were a host known for distributing ransomware, or actually displaying a ransom page, that would be set to true. Then you could decide: do I want a nightly overview of all my users that got affected by ransomware today, or do I want an immediate alert the moment it happens? That's the difference between doing something in real time and doing something on aggregated data. But you have this parsed out right here, and it happens the moment the message comes in, so you could act on it within a few milliseconds of it happening in your network. And you'll see that these fields, for example DNS question or DNS question class, are also here on the left in the sidebar. All fields in the result set end up in that sidebar, which is very helpful for analysis, as I'll show in a second. You can also build your own tables here if you want more information, destination address, here we go; you can add columns to that table very easily. Now let me open another message that has even more enrichment.
And you already see it's all in one big bucket, right? We've got OSSEC alerts, we've got DNS query information, and, if we scroll far enough, oh, that's my access point at home, it's actually working. And let me search for some NetFlow messages; I'll cheat here for a second and explain what this is in just a moment. Here we've got NetFlow information that is enriched with a lot of extra data. Let me find one with external communication. This one, for example: you see there was a rejected TCP connection from this IP address, on this source port, to this internal IP address on port 23, of course. And then you see this has been enriched a lot. For example, the destination address entity, which was not in the original message: this is something Graylog looked up in the AWS API. You see this is an EC2 host, the Amazon virtual machines, and this is its ID. You see the AWS type is EC2, and not, say, ELB, which is a load balancer, or RDS, which is their database service. You see the name of the instance, which in this case is our ECS container instance, ECS in turn being the Amazon Docker service. You see the destination address threat indicated field. And you'll see the destination address is also flagged as an internal address, because we have a pipeline step that looks at it and sees that a 172.30.x address is, if you follow RFC 1918, most likely always going to be internal.
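That check, expressed with Python's standard library, is nearly a one-liner. This is a sketch of the logic, not Graylog's actual implementation, and note that is_private covers the RFC 1918 ranges plus loopback and link-local, among others.

```python
# The "is this an internal address?" pipeline step, in standard-library form.
import ipaddress

def is_internal(ip):
    return ipaddress.ip_address(ip).is_private

print(is_internal("172.30.1.5"))   # True  (inside 172.16.0.0/12)
print(is_internal("8.8.8.8"))      # False (public address)
```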
And you see that the source address, since it's an outside internet address, also got translated to a geolocation. This is a bad example: we couldn't translate this one to a WHOIS country code or WHOIS organization, but maybe we'll find a better one. There we go. Here one of our internal IP addresses communicated out on the HTTPS port to a server owned by Google, and you see it here: destination address WHOIS information is Google. So this is what I mean by having enriched data. It already gives you a good idea of what's actually happening, in a very easy-to-use way, I think, because otherwise you'd have to open another tab and look this up, and do this, and do this. With this enriched information you can just go in and say: show me, for the last hour, all destination address country codes. Quick values will build you a pivot table, and you see that 34.37% of all destination addresses were based in the United States, we've got some in Germany, some in France, some in Great Britain, the Netherlands. Or you could say: show me on a world map where all of those source addresses are located.
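Under the hood, quick values amounts to a terms aggregation, which you can also get over the API. This is a hedged sketch: the endpoint, parameters, response shape, and field name are all assumptions from the 2.x API, to be verified in the API browser.

```python
# What the Quick Values widget does, as a plain API call: a terms
# aggregation on one field over a relative time range.
import requests

resp = requests.get(
    "http://graylog.example.org:9000/api/search/universal/relative/terms",
    params={"field": "destination_address_country_code",  # hypothetical field
            "query": "*", "range": 3600},
    auth=("admin", "password"),
)
resp.raise_for_status()
for value, count in resp.json().get("terms", {}).items():
    print(value, count)
```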
So with this stuff, and with all of this in one big bucket, I wanted to show you one real-world analysis, I would say. Imagine you're a security researcher, you see an IDS alert, and you want to figure out what exactly is going on with it. Because it's very high severity, this alert could reach you as an email or some other notification, or you could just see it by going into the stream that selects all the OSSEC logs. Like here: if I click on show stream rules, you'll see that every message containing the value ossec is routed into that stream. So if I click on it, you now see only OSSEC log messages, over the last 8 hours, for example. You'll see there are a lot of alerts going on, because that thing is just out in the wild, in public. Then let's execute the saved search that only shows the very-high-severity events, I saved it earlier. You see I just executed a query that says: show me every message where the field severity is set to very high. This is a completely untuned OSSEC, by the way; I know that a user getting their password wrong more than once on an OpenSSH server is not necessarily very high severity. But let's imagine we wanted to see where all those very high alerts are coming from: what was the source address that caused each OSSEC alert? Let's run quick values on source_address. You'll see we've got a bunch of pretty active actors here: roughly half of all very high alerts were caused by this one specific IP address. We could also say: show me the geolocations of those. Now we see where all the high alerts are coming from, and this is always based on the current result set, right? So as you dive deeper into an analysis, these visualizations change with it and give you a very easy-to-follow way of seeing what's going on. But now let's say we want to investigate what exactly this IP address did in our network. I just click the search button here, which adds it to the query so you don't have to type it yourself, execute again, and now you've got only the OSSEC alerts of this one very specific source address.
And now you can go in and say: actually, I want to see every activity in my network. This is the whole correlation part, the whole idea of having everything in one bucket and then doing very easy correlations. Going back to the overall search, we just search over the complete bucket of data, over the last day, entering the IP address here: source_address equals this. You'll see we now get not only the OSSEC alerts; if we run quick values on that, we've also got about three and a half thousand network connections. So maybe there was activity that OSSEC didn't capture. We can say: show me all network connections of this source address, and then look at the destination addresses, so, where did this IP address connect to in my network? You'll see it actually connected to one, two, three, four different hosts, but that 99.68% of it went to this one specific host. We could enrich that again and say: I want to see what exactly that was. And you see it went against the honeypot, which is where OSSEC actually caught it. But it also had at least one connection to our Graylog server, one to a load balancer, and one to, I don't even know what host that is, some other host, nine connections actually. Now you could analyze the destination ports and get a lot of context, to then decide: is it worth it for me to go into, say, a TCP dump and look at what exactly this IP address is doing in our infrastructure? And on enrichment: we're working on something where you can add another enrichment that says, is this IP address a known public hoster? We'd collect all the published IP ranges, from ARIN for example, take all the ranges of Amazon, of Microsoft Azure, of GoDaddy, DreamHost, all of those. Then we can say: yes, this IP address came from a hoster, and it's a little weird that someone is doing something on our payment APIs, acting like a browser, acting like a real user, but coming from a hoster. That might be someone trying to hide their real origin with a hop in between, for example.
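A hedged sketch of that hoster enrichment; the ranges below are placeholders, and in practice you would pull published lists such as AWS's ip-ranges.json.

```python
# "Known public hoster" enrichment: collect published IP ranges and
# test membership of the address we saw.
import ipaddress

HOSTER_RANGES = [
    ipaddress.ip_network("52.0.0.0/11"),     # placeholder AWS-ish range
    ipaddress.ip_network("104.40.0.0/13"),   # placeholder Azure-ish range
]

def from_hoster(ip):
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in HOSTER_RANGES)

# A "real user" on your payment API coming out of a hosting range is weird.
print(from_hoster("52.23.45.67"))
```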
So with this kind of analysis you get a very nice way of doing incident response, of looking at the data and then deciding, from there, how much deeper to go on this threat. Coming back to the easy integration, and we're also coming to an end, I quickly wanted to show you two things. The first is the API browser, which might need a moment to load. There we go. These are all of the APIs that Graylog opens up for you, and these are only the top-level APIs; if I click on one, there are several calls inside it. This is how you can integrate it with everything and really make it a database layer for your organization, because with the enriched information in there, you don't even have to use the web interface. Maybe you want your own script that runs certain hunts for you, that you can share with co-workers, that writes results somewhere else, writes an email, does all kinds of integrations. And if you want to play around with it, you can also execute these calls directly from the browser. This is Swagger, by the way, an open source project; it's great if you have any kind of API you want to make browsable. We didn't build that ourselves.
The second thing: you can take that information and build the fancy dashboards you can show your boss, to show that you've got the cyber under control. Here, for example, is another case where we take this, I think it's a Spamhaus ransomware blocklist, take the DNS requests, and see if there's anything that's known, with a high chance, to be involved in ransomware. This is aggregated, but you could also trigger something the moment it happens, do something about it immediately, and have it sent to you with the enriched information: this happened on a workstation, and we already looked up that this workstation is currently used by a certain username, for example. So you know exactly where to run, to pull the network cable and make sure nothing else happens there.

Yeah, that's my overview of Graylog. Like I said, this will probably work with a bunch of other tools out there too; use whatever you're most familiar with. It's my first iteration of this idea of having a database layer and improving some processes in the organization. I'll be around for the rest of the day, and if you want to talk about other things, have ideas, or have your own experience with this, I'd love to talk. I think we've got a few minutes left for Q&A if you have any questions.

Audience: [Partly inaudible] Great talk. Have you ever seen the concept of data lakes?

Yeah. In that case I would go with the Kafka architecture, because Kafka works great with something like Hadoop: you send raw data into Kafka and write that into Hadoop, or write the enriched data back onto Kafka and write that into Hadoop too, so you have both. The whole concept of a data lake is that you keep the raw data so you can do other things with it later on, and that is exactly the flexibility of Kafka, or any other message queue that lets you store raw data as well as enriched data and subscribe to both. Very flexible. We actually see a lot of people writing everything into Hadoop too, because for some analyses you're OK with a job running for 30 minutes if it gives you a really, really good result. That's definitely something you can do with this.
Yeah, exactly.

Audience: [Partly inaudible] We have multiple search heads and multiple indexers. How scalable is it?

I'll answer the question "how scalable is Graylog," because "multiple search heads" sounds like you have a Splunk background. It has been built from the beginning to be horizontally scalable. We have a big US customer, I can't name them, but you'd all know them, with a huge environment: they're sending 750,000 messages a second, and that scales. They only keep the data for a week, but they could keep it longer if they added more hardware. We're using Elasticsearch as the backend, but we use it differently than if you were just writing into Elasticsearch and reading from it again, because the Graylog server sits in the middle, with a message journal in front to make sure you never lose a message when the buffers fill up, especially during load spikes; caching and dashboards work differently too. So we use Elasticsearch differently, but I think Elasticsearch itself has proven to be fairly scalable. I've seen setups with about 60 Graylog servers and hundreds of Elasticsearch servers, and I've seen some with everything on one machine. It works with whatever the architecture looks like.
Audience: I have another question, because this comes up a lot when people are hesitant to accept open source. What's your argument for open source?

I think there are all the standard open source arguments, right? And I've seen this Splunk talk from their director of competitive intelligence or something: the moment you mention open source, there's a high chance they're going to send a bus of consultants in to scare you about open source and come up with '90s arguments about how it's going to be more expensive if you don't buy all of that. I think the biggest benefit, in this particular use case, log management, is that by not having a restrictive license on top, you can log whatever you want. And it's much easier to get more hard disk, or maybe a few more cores and a little more memory, than to get a bigger license for your proprietary tool. Then it just comes with all the community benefits: for example, there's a bunch of stuff in this threat intelligence lookup plugin that came straight from the community. Someone already built the MineMeld integration; it's just there, you don't have to wait for us. If you want to build something yourself, go build it. And if people are reluctant because they say no one's supporting it, we don't really know it, we want indemnification, and we want to redline a huge contract because that's how we buy things internally: you can do that with us. I don't want to go too deep into that, but if you want to, yes, we will provide support for it.

If there are no more questions, I'll be around. Yeah, thank you.