
Tom Maddock - Splunking AD - BSides San Diego 2017

BSides San Diego · 34:49 · 106 views · Published 2017-01

Hello, everyone. I'm Tom Maddock, and I am here to speak to you about Splunking AD, or Active Directory. This is my first talk. So

I just want to make it clear this is not a sales pitch for Splunk. I hear Splunk is used in a lot of environments already, so it's a good tool to share knowledge about. But you can use other tools; you don't have to use Splunk necessarily. And I don't have any secrets to share with you. So, a little bit about me. I'm a cybersecurity engineer at Qualcomm. I've been in IT for the last 10 years, and for the last three I've been focusing on information security. I'm a big Windows nerd. At any given time, I probably have three or four PowerShell windows open doing something. I'm a big fan of PowerShell. And despite the fact that I'm relatively new to

the security industry, I managed to get myself an acknowledgment from Microsoft for a vulnerability: MS14-068. That was fun. So, I have a lot of Active Directory experience. And the point of this talk is that you all probably have Active Directory in your environment. Maybe you have multiple domains. But are you making sure that all those domain controllers are logging the things they're supposed to be logging? Are you collecting those logs somewhere centrally so you can go search them and find bad stuff? And of the bad stuff, are you finding all the really neat things you could be finding in the AD logs? Because it's a rich information source, if perhaps cryptic and noisy. So

if you answered no to any of those questions, then I hope you'll find some things in this talk useful. So this is the state of the problem. This is what you get by default on Windows on a domain controller. This is the logging mechanism: it's Windows Event Viewer. It's not great. This is from a test domain that has five machines in it, and it's already logging one event a minute or one event a second. So it's pretty noisy. You can't really search for much in here, and on top of that, the logs that you can find are very cryptic as to what they mean: credential validation versus logon versus something else. What does this all mean? You have to kind of get your decoder ring out

and figure out what you're looking at. So if you want to try and make sense of it, you have to do a few things. The first is very important: you have to put yourself in the mindset of the developers at Microsoft who are working on Active Directory. They're very into the protocols, the authentication protocols that Active Directory uses, like Kerberos and NTLM. And all the logs are written from the perspective of someone who knows the protocols very well. So if you can learn those protocols in detail, and exactly which step in each of those protocol exchanges creates which event, that helps tremendously. The second thing is don't use Event Viewer. Centralize the logs in one tool or one

place. Splunk, like I said, is the tool of choice for this talk, but it's not the only tool. There's the ELK Stack. There are other free and commercial things. So none of this is Splunk-specific. The demo will be specific, but the theory is not necessarily. And the last thing, which is really important, is once you've got all this stuff set up, you definitely have to play with it. So you log into your machine, and you see some logs. But then log into something else. Log into an app that uses LDAP, going to AD. Log into that and see what that looks like. I wouldn't say go log into everything, because you don't want to spread your credentials around too much, but log into things

you trust and see what it looks like in the logs. You might see something different that you weren't expecting, and then try to figure out why it looks different. Where do you go to research this particular one? There's actually a great website, Ultimate Windows Security, that has at least a partial, if not a full, description of all the events you could possibly see among these security events, plus some details, comments, and whatnot about people's experiences with them. It's a good starting point. So what do you get from all this? I said there's a lot of good stuff. What kind of good stuff? Well, all this. This is what you can potentially find in the AD logs. There's the simple stuff like password brute forcing, which we

saw in the screenshot. You can also find network reconnaissance, like people running BloodHound. If you're not familiar with BloodHound, it's a neat tool developed by Will Schroeder and others. It's a PowerShell script, essentially: a PowerShell front end with a Neo4j database back end that basically scans your entire network. It finds all the things you need in order to figure out how you can get to the domain admin you eventually want to steal. It's a really neat tool, but it's also very noisy, and everything that it does shows up in the AD logs. So if you're looking for that, you can detect it quickly. You can also spot lateral movement: people using credentials to hop

from one machine to another to another. And then specific attacks, which are actually in the demos at the end here. DCSync is one. If you're not familiar with that, the technique has actually been around since the inception of Active Directory, but it's been codified: it's embodied in a function of Mimikatz.

Basically you use a domain admin credential and you connect to a domain controller. And you say, hi, I'm a new domain controller in town. Please replicate all the hashes to me so I have a full set. And it just goes ahead and sends you the hashes. You don't have to log on to the domain controller anymore and bring your malware and try to dump the hashes and possibly set off a bunch of alarms. It's really good for attackers. And that's something that shows up very strongly in the Active Directory logs. And there are other techniques like golden ticket. I put golden ticket and pass the ticket on there because actually, in some ways, they come up

the same. In a normal login scenario, the patterns that you see with this stuff are pretty well defined, and there's not too much variation outside of them. But when someone's using a golden ticket or pass-the-ticket attack, it goes outside of those patterns. It takes a little more analytics, but if you're very careful about it, you can detect it, or at least come up with a top ten list of things to go investigate. Sometimes it will be a little noisy, but most of the normal stuff definitely drops out. And then pass the hash and silver ticket: I put those on there with asterisks next to them because in a large environment, at least for pass the

hash, you can't detect that solely from the Active Directory logs. There are usually enough corner cases and weird stuff going on in any environment that, unless it's pretty small, like say 100 users, you probably can't detect it from the AD logs alone. You need some help. You need to collect logs from the member machines as well. But you can still get pretty far. And then silver ticket, I just put on there because it's something you definitely need the AD logs for as well as member machine logs. But member machine logs aren't the focus of this talk; I just included that for completeness. So going back to, so what do you need

to understand the Active Directory logs? First you need at least a basic understanding of Kerberos and NTLM, the two protocols Active Directory uses to authenticate user accounts. So to illustrate this, I put together some PowerPoint animations here.

So in a typical Kerberos scenario, you have a client that wants to access some sort of resource on a server. That's the goal here. The first thing the client has to do is, using a password, create a request for a ticket-granting ticket and send it to the domain controller.

Whenever that happens, the domain controller logs event ID 4768, "A Kerberos authentication ticket was requested." And from that you get two very important pieces of information. First you get the IP address of the client, and then you also get the username of the user account that was used. You can use those for later analysis. Assuming that's successful, the DC returns a ticket-granting ticket to the client, and the client stores that in memory. In theory it can throw away the password, though in practice we know that Windows didn't always do that; it's getting better about that now. So that's basically the state after the ticket-granting ticket request is

successful. Next, it still can't talk to the server, because it doesn't have the thing it needs, the service ticket, to do that. So the client has to make another request to the domain controller: it sends the ticket-granting ticket up. And then there's a different event that gets logged. It's event ID 4769, "A Kerberos service ticket was requested." Assuming that all checks out, the DC sends back a service ticket to the client. You get three pieces of information here. You get not only the same two as last time, the IP address of the client and the username, but you also get the name of the server the client is trying to access

because that was included in the service ticket request. Just like before, the ticket comes back and is stored in memory. And the client now has what it needs: to connect to the server, it sends the service ticket across to the server. The server is able to verify the ticket is in fact genuine because it has a shared secret with the domain controller, which is another topic we won't go into.

And so the ticket is encrypted. Anyway, the server checks it out, and if it's successful, the server logs a 4624. I'm just including that there for completeness and also to point out that the DC does not log it. A 4624 event is only logged when all of the Kerberos transactions have occurred and the final destination, the server, the thing that has the file share, has been accessed and the logon was successful. The DC doesn't see whether the service ticket was actually used or not. All it did was give it to the client, and it assumes the client is going to use it, but who knows if the client restarted or left the network or something. So that was Kerberos. There were three basic steps.

Get the TGT, get the service ticket, and then use the service ticket. NTLM is the other protocol, which is older than Kerberos. It's quite a bit older, and it's the namesake of the hash we all know. It works a little differently, though, and it's important to understand the differences between NTLM and Kerberos. Like before, it starts with a password, but instead of the password being used to request a service ticket, I'm sorry, a ticket-granting ticket, it just becomes a hash stored in the client's memory. Then when the client has to reach out to the server to access whatever share it's trying to get to, it initiates what in this case would be an SMB connection. It does a handshake and it does a challenge-response in

which the hash in memory is used to generate a correct response to send to the server to try and prove its identity. The server, however, doesn't have the ability to validate that response itself. It doesn't have access to the user's hash; that's stored on the domain controller. So the server has to send that up to the domain controller. And when that happens, we get two events: 4776, "The domain controller attempted to validate the credentials for an account," and also 8004, "Audit NTLM authentication to this domain controller." So the 4776 one,

you probably already get, because that's on by default. The 8004 one is actually not on by default. It was added in Windows 7 and Server 2008 R2 and later, so you have to go through a number of extra steps to turn that one on. And it's crucial for understanding what's going on in the environment, because unfortunately the 4776 event only includes the name of the client and the user whose hash was being used. You have no idea which server is being accessed. Even though the DC has a direct connection to the server, it isn't even logging that information to say, oh, by the way, I got this request from this server. So

from those logs, you can't tell. 8004, fortunately, has that bit of information. So the two together form a complete picture of what's going on. Anyway, to complete this: if the DC is able to validate that the response to the challenge is correct, based on the hash that it has for the user in its own directory, it sends back the OK, and the server considers that a successful logon. So that's the same event as last time, 4624, "An account was successfully logged on." It happens both in the Kerberos case and the NTLM case, but only when the authentication is all complete. I would say Kerberos and NTLM are a means to an end, and that logon

success, 4624, that is the end itself. "Is there any way in the 4624 to tell if it's NTLM or Kerberos?" Absolutely. Thank you for asking. In the log it says authentication package NTLM or authentication package Kerberos. It even tells you which version of NTLM, version one or version two, and it also tells you the key length for NTLM. So yes, you can tell; the server knows. All right, so

quick recap here. For Kerberos there are two events, 4768 and 4769. In both of those you get a username and an IP address, but in 4769 you also get the name of the service that the client is trying to access. And in NTLM you get 4776 and 8004.
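As a compact reference, the recap so far can be written down as a small lookup table. This is only a sketch in Python: the dictionary layout and the field labels (`username`, `client_ip`, `service_name`, `client_name`, `server_name`) are illustrative names chosen here, not the literal field names Windows uses in the events.

```python
# Which pieces of information each domain controller event carries,
# as described in the talk. Labels are illustrative, not Windows field names.
DC_EVENT_FIELDS = {
    4768: {"protocol": "Kerberos", "step": "TGT request",
           "fields": {"username", "client_ip"}},
    4769: {"protocol": "Kerberos", "step": "service ticket request",
           "fields": {"username", "client_ip", "service_name"}},
    4776: {"protocol": "NTLM", "step": "credential validation",
           "fields": {"username", "client_name"}},
    8004: {"protocol": "NTLM", "step": "NTLM authentication audit",
           "fields": {"username", "client_name", "server_name"}},
}


def events_with(field):
    """Return the sorted event IDs whose summary includes the given field."""
    return sorted(
        event_id
        for event_id, info in DC_EVENT_FIELDS.items()
        if field in info["fields"]
    )
```

Asking `events_with("client_ip")` makes the asymmetry visible: only the Kerberos events carry an IP address, which is why the NTLM events need help from other data sources.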

You get the username and the server name, but, and this is important going forward, you get the client name. So just the name of the client, whatever that machine happens to be called. You don't get its IP address. And furthermore, and this is actually really important to remember, excuse me, that's entirely on the honor system. When the client connects to the server and says, hey, I want access to this resource, in the authentication there's a field for the client to identify itself. It's just an open field to put whatever it wants. I don't know of any particular malware that is abusing that, but it's entirely possible to just put whatever

name you want. And there's no real way to authenticate that. The only way you would be able to figure out who is actually on the other end of that connection is to collect the 4624 event from the server that was accessed. So, okay, the 4624: that is the successful authentication event once that's all finished. And that is only logged on the machine that was accessed. Now, I also want to point out that sometimes you see 4624 on the DC. Why is that? Well, it turns out the DC is also a server that has resources that clients want to access, like group policy. When a Windows box is downloading group policy, it actually just makes a CIFS connection, or an

SMB connection to the domain controller. Also, if it wants to do any LDAP searches, it authenticates to the domain controller directly. So the domain controller is actually both the DC and the server in that picture. It's weird, but that's how it is, and it works perfectly. So that was NTLM and Kerberos. But now that we know roughly what we're expecting to get out of the logs, how do we get the logs? And how do we make sure that the DCs are logging the right things, first of all?

The policy that comes out of the box isn't bad, but there are things you need to add to it. First of all, you need to increase the size of the security event log on the domain controller, because by default it's like a couple hundred megs or something like that; I think maybe 64 megs, in fact. You want at least two gigs, because that thing will roll over quickly. I mean, the events will be gone in a heartbeat. So you definitely need to do that. Also, you need to enable the extra NTLM auditing that logs the 8004, which has that important piece of information about the server that's accessed. And then there are some other features, one of which is called

special groups. It's a feature that Microsoft added, I think also in Windows 7, where any time a logon event occurs, the 4624 event, the machine that received it will look for specific SIDs in the token that gets generated there. I'm jumping ahead of myself, sorry. So, for example, say you're a domain administrator. Your token has a specific SID for domain administrators in it. And any time you pass that token over, like in that Kerberos ticket, or in the NTLM case, where it's also a token that gets generated on the server side, it has that SID embedded in it. That's actually an important SID in the golden ticket attack. You can

manipulate that SID to be whatever you want. And so, if you enable special groups auditing, a machine that sees that SID will log it immediately in a different event. That's very useful for detecting golden ticket attacks, when someone is trying to inject SIDs into tickets where they don't belong. I actually have a repository that has a sample GPO that has all this stuff turned on, so I encourage you all to go check that out after the talk. There are a few more things to add. There are auditing entries on the domain root object that you need to add, which will watch

for DCSync traffic. Basically, regardless of whether it's a domain controller or someone impersonating a domain controller, any time anything tries to replicate hashes, if you have these three added, so Replicating Directory Changes, Replicating Directory Changes All, and Replicating Directory Changes In Filtered Set, that will capture that activity in the logs. And so now, here goes.

That's all great if you have DCs logging and you understand what the logs are, but you have to get them somewhere so you can look at them centrally and do good analysis. For Splunk at least, they make it very easy: you just install an agent on the domain controller, point it at the Splunk server, and say send the logs there. So that part works pretty well. The problem, though, is that in any big environment you're going to have lots of logs, and Splunk charges by the byte. There's a lot of data in there and it can get expensive. And one of the

neat things you can do, at least with Splunk, is apply some sed expressions to the logs before they enter the Splunk server and trim out all the extra junk that exists in the event log that you don't really want to pay for. An example is event 4624: there are like three paragraphs of explanation in each event about all the different fields. You don't want that. You're paying for that. That's not useful. So just strip that out. In the repository that I have, there are sample configs that do that for you. Another thing that makes your efforts more worthwhile is to set up what's called field

extraction in Splunk. By default Splunk just looks at the raw text and tries to pick out things that look like key-value pairs and make them fields, so you can group by them and do analysis on them. You can add your own. So what I've done is I've gone through and enumerated all the possible events in the security log and written field extractions for them automatically, just by using the replacement strings that the app has. That's all at the repository URL. And then finally, to complete the picture on Splunk, you install the Splunk Supporting Add-on for Active Directory. This is just a free app that's on the Splunk app

site. It lets you query Active Directory via LDAP. I use this in the demo to enumerate all the domain controllers and enumerate all the domain admins. Once you have it all in Splunk, this is what it looks like. I'll admit, right now it isn't very useful. It's just, there are events, and I can see those coming in. But you have this awesome search bar where you can just run arbitrary queries and then see what you can find. So let's go live.

So, actually, we're going to answer a question first. In Kerberos authentication, do you get the IP address or the name of the client that was authenticated?

So

you can see it's enumerating all the objects in the domain, and then it's going to find all the computer accounts and go try to connect to those and see if it can connect to those computers. It's working hard, getting a lot of good info. This is not the whole BloodHound set, by the way. BloodHound has a whole graph database side to it, which I'm not using. I'm just using the PowerShell ingestor to run all the scans.

So here's a Splunk search that will detect things like BloodHound.

It's looking at 4769, trying to find machines that have requested service tickets, or NTLM authentication, to multiple machines within a short period of time. It can actually be noisy. But the thing about BloodHound is it talks to everything, so it is a very strong signal even so. Right now you can see that. I don't know if that's very readable. Let me try to make that bigger. So we have user Bob, who just a few minutes ago made 14 total service ticket requests, with requests to all six, and that's the only thing that has happened in the last 15 minutes that matches. And there's other stuff happening; there's always background activity. So what you might do

is write an alert that sets an arbitrary threshold: you take the size of your environment, divide by 10, and say that if anyone talks to more than 10% of machines, that's probably something worth looking into, and that could be BloodHound.
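The threshold idea can be sketched outside of Splunk as well. The following Python is a minimal illustration under stated assumptions, not the search shown in the demo: the event dictionaries and their keys (`user`, `target`) are hypothetical, and the events are assumed to be already limited to a short window such as the last 15 minutes.

```python
from collections import defaultdict


def fan_out_alerts(events, total_machines, fraction=0.10):
    """Flag users who requested access to more than `fraction` of all machines.

    `events` is an iterable of dicts with hypothetical keys "user" and
    "target" (the server named in a 4769 or 8004 event). The caller is
    assumed to have already filtered the events to a short time window.
    """
    targets_by_user = defaultdict(set)
    for event in events:
        targets_by_user[event["user"]].add(event["target"])

    threshold = total_machines * fraction
    return {
        user: len(targets)
        for user, targets in targets_by_user.items()
        if len(targets) > threshold
    }


# Made-up sample data: bob touches six servers, alice touches one.
sample = [{"user": "bob", "target": f"srv{i}"} for i in range(6)]
sample.append({"user": "alice", "target": "srv0"})
alerts = fan_out_alerts(sample, total_machines=50)
```

Counting distinct targets rather than total requests is a deliberate choice here: a user hammering one file server generates many events but is not scanning the network.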

On another machine here, I have a domain admin logged in, Alice. And I'm going to use Mimikatz to DCSync down the hash of the krbtgt account.

So this is just standard Mimikatz.

I'm just using lsadump::dcsync /user:krbtgt. So we're just DCSyncing: connecting to one of the DCs and pulling down the hash. So now let's go back to Splunk.

So this is a search that looks for DCSync. And it's actually very reliable, I would say. It's using a different event

that we didn't cover in the animations. It's looking specifically for an access of this property on the domain root object. That property translates to the extended right Replicating Directory Changes All. And this object type here is the domain root object. So it's looking for anyone doing that, and then it's searching back in the logs to try to find the logon event, 4624, that correlates with that particular activity. Yes?

Thank you for asking. They are not. These are universal. It's actually possible to have GUIDs that are specific to a particular directory, but no, these are not.

They're actually well-defined GUIDs. My basic approach to that is, if I find a GUID, I Google it. And if I find anything, it must be meaningful. And actually, if you Google these, they show up in all sorts of Microsoft documentation. So that's how you know you've found something meaningful.

So this is Alice logging in. This is the 4624 event on the domain controller, because in order to DCSync, you have to log in; the server is the domain controller in that case, so it is recording the 4624 event. And we have the IP address of the workstation Alice is on. This looks like a normal logon event. You wouldn't be able to tell just from looking at it that this was a DCSync. But because it's related to another event that's not shown here, we know that was a DCSync. It's the two together. I should also point out that DCSync is happening all the time, but only between actual domain controllers. So you want to ignore those. And there's a piece here in the

search where we're getting the actual list of domain controllers, then looking up their IP addresses, and then just omitting them from the results. So that's how you make sure you're not alerting on actual replication.
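The correlation described here can be sketched in Python, under several assumptions that are not from the talk. The talk does not name the directory-access event; this sketch assumes it is event 4662 and that it can be joined to the 4624 logon through a shared logon ID. The dictionary keys (`logon_id`, `property_guids`, `user`, `source_ip`) are hypothetical. The GUIDs are the documented control access right identifiers for the three replication rights as best I know them, and are worth verifying against Microsoft's documentation before relying on them.

```python
# Replication extended rights (GUIDs believed correct; verify before use).
REPLICATION_RIGHTS = {
    "1131f6aa-9c07-11d1-f79f-00c04fc2dcd2": "Replicating Directory Changes",
    "1131f6ad-9c07-11d1-f79f-00c04fc2dcd2": "Replicating Directory Changes All",
    "89e95b76-444d-4c62-991a-0facbeda640c":
        "Replicating Directory Changes In Filtered Set",
}


def find_dcsync(access_events, logon_events, dc_ips):
    """Return replication accesses whose logon came from a non-DC address.

    access_events: directory-access events (assumed 4662) with hypothetical
        keys "logon_id" and "property_guids".
    logon_events: 4624 events with hypothetical keys "logon_id", "user",
        and "source_ip".
    dc_ips: set of known domain controller IP addresses to ignore.
    """
    logons = {event["logon_id"]: event for event in logon_events}
    hits = []
    for event in access_events:
        # Normalize "{GUID}" or mixed-case forms before the lookup.
        guids = [g.strip("{}").lower() for g in event["property_guids"]]
        rights = [REPLICATION_RIGHTS[g] for g in guids if g in REPLICATION_RIGHTS]
        if not rights:
            continue
        logon = logons.get(event["logon_id"])
        # This sketch skips accesses with no matching logon or from a real DC.
        if logon is None or logon["source_ip"] in dc_ips:
            continue
        hits.append({"user": logon["user"],
                     "source_ip": logon["source_ip"],
                     "rights": rights})
    return hits
```

Excluding by source IP rather than by account name mirrors the approach in the talk: legitimate replication is defined by where it comes from.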

OK. So now we have the krbtgt hash. Let's do something interesting. We can use Mimikatz as well to do a golden ticket attack. So I'm using kerberos::golden. I'm going to impersonate a user, Charlie, who actually does exist in the domain. I'm passing the krbtgt hash. And I have to provide the argument /domain:demo.lab, which is the name of this domain, and then the SID of the domain. And I'm also adding the /ptt flag, which makes Mimikatz inject the ticket it creates directly into the session I'm using now. And by default Mimikatz will go and add the domain administrator SID to the ticket that it

creates for me. So if you go back to the Kerberos diagram, that blue ticket, we just created one, and that one says I'm a domain admin. So I'm going to go use that. A simple way to use it is to just try to enumerate the C$ share on a domain controller. If you can do that, you're definitely a domain admin. So I'm typing dir \\dc1.demo.lab\c$. All right, cool. I was able to do that. Let's go see what that looks like. There are actually two things that we can find here: one very high-signal one, and one that's a little noisy. So, you know that special groups feature I was talking about? It's event ID 4964. We didn't go over that in

the animations. It's configured to log any time it sees this SID appear in a token. The SID ending in 544 is the well-known Administrators group for the domain. So all it's doing is looking for people logging in with that SID in their token. And then we're excluding the DCs again, because they end up getting that. And then we're also excluding the real domain admins. This macro here, get admin, just runs an LDAP search to return the domain admins. It gets the names, and we exclude them from the results.
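The exclusion logic of that search can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the actual Splunk search: the event keys (`event_id`, `user`) are hypothetical, and the two allow-lists stand in for what the talk retrieves via LDAP.

```python
def suspicious_special_group_logons(events, dc_accounts, domain_admins):
    """Return 4964 events for accounts that are neither DCs nor real admins.

    events: dicts with hypothetical keys "event_id" and "user".
    dc_accounts, domain_admins: account names that are expected to carry
        the special-group SID (in the talk these come from LDAP lookups).
    """
    expected = {name.lower() for name in dc_accounts}
    expected |= {name.lower() for name in domain_admins}
    return [
        event
        for event in events
        if event["event_id"] == 4964 and event["user"].lower() not in expected
    ]
```

Because the allow-lists come from a separate data source, a newly added admin or domain controller can briefly appear in the results until the lists catch up.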

So this is an interesting event. Charlie, who is not a domain admin, was assigned special groups. So that right there, that's suspicious. Now, in a large environment, you could potentially get some noise when new domain controllers are coming up and down or new domain admins get added. The synchronization between what you're getting from the logs and what you're getting from LDAP may not be perfect. So at the edges there, there's always some possibility for noise. But otherwise, this is a very strong signal: definitely someone messing with tickets, and the green ticket doesn't actually exist. The other thing we can see, and this is a little noisier, but still I

think it's worth looking at: if you go back to the Kerberos animation, remember the client first asks for a TGT, gets that, then asks for a service ticket, gets that, coming from the same client, presumably the same IP address. Well, what would you think is the problem if you only saw the service ticket request from that client but you never saw a TGT request from that client? Anyone want to take a guess about what that might indicate? "They already have the golden ticket." Yes. That's exactly what I'd guess. They got a TGT some other way. Now, that can be a little noisy because clients can change IP addresses. If you're on a wireless network, you're going to get a different address from time

to time. But fortunately, Kerberos is chatty enough that it sort of averages out. There's not that much noise, in fact. So all we're looking for here is

We're looking for either of the 4768 or 4769 events. We're getting all those and excluding domain controllers.

So, all we're doing is getting both the 4768 and 4769 events and combining them to find out which IP addresses made one but not the other. And so here we have Charlie: we only saw a 4769 event come from him, from this IP address; we never saw a 4768. So something's really up with that. That's worth investigating. So, that's all the demos I have. I hope it was a good one.
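The "one but not the other" comparison is essentially a set difference. Here is a minimal Python sketch of it, assuming hypothetical event dictionaries with keys `event_id`, `client_ip`, and `user`; it is an illustration of the idea, not the Splunk search from the demo.

```python
def tgs_without_tgt(events, dc_ips=frozenset()):
    """Map client IPs that sent 4769 requests but no 4768 to the users seen.

    events: dicts with hypothetical keys "event_id", "client_ip", "user".
    dc_ips: domain controller addresses to exclude from the comparison.
    """
    tgt_ips = set()
    tgs_users_by_ip = {}
    for event in events:
        ip = event["client_ip"]
        if ip in dc_ips:
            continue
        if event["event_id"] == 4768:
            tgt_ips.add(ip)
        elif event["event_id"] == 4769:
            tgs_users_by_ip.setdefault(ip, set()).add(event["user"])

    return {
        ip: sorted(users)
        for ip, users in tgs_users_by_ip.items()
        if ip not in tgt_ips
    }
```

The time range fed into a comparison like this matters: too short a window will miss a legitimate TGT request made earlier in the day, which is one source of the noise mentioned above.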

"Question about NTLM. The IP address is not available in the logs. What's your recommended way of investigating those events and getting the IP address?" Well, in general, you have to rely on other data sources that can give you a mapping between host name and IP address. DHCP logs could be useful, things like that. There actually is a potential, and I would file this under further research needed, but I think there's a potential, in fact, to use the AD logs to try and build a table like that. So take the 4624 event on a domain controller. Like I said, clients are reaching out to the domain controller all the time to check group policy and make other queries. In fact, one

of the first things that a client likes to do as soon as it gets on the domain is talk to a domain controller and make a 4624 event appear on there. And that's happening as the computer's own account, not the user. No one has to be logged in for this to happen. So on the domain controller, there are a lot of 4624 events associating a client name, a host name, with an IP address. And you can start to build a table from that. I will say this about Splunk: it's difficult to do that, because, I mean, it's great at some things, but one of

the things it's not great at is a massive relational query like that: I have an IP address, show me the earliest 4624 event that has that IP address and the host name in the event. At scale, across all your logs, that's still a difficult problem. I think there's some more work to be done there. But yes, I hope that answers your question. You basically have to get it from other data sources and try to create a little table or do something like that.
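The table-building idea floated in that answer can be sketched as follows. This is a speculative Python illustration of something the speaker files under "further research needed," not a tested technique from the talk. It relies on the fact that computer account names end in `$`; the event keys (`time`, `user`, `source_ip`) are hypothetical.

```python
def build_host_ip_table(logon_events):
    """Build a host name to most-recent-IP table from DC-side 4624 events.

    logon_events: dicts with hypothetical keys "time" (any sortable value),
        "user", and "source_ip". Only logons by computer accounts, whose
        names end in "$", are used, since those identify the machine itself.
    """
    table = {}
    for event in sorted(logon_events, key=lambda e: e["time"]):
        user = event["user"]
        if not user.endswith("$"):
            continue  # skip user logons; they do not name the machine
        host = user[:-1].lower()
        table[host] = event["source_ip"]  # later events overwrite earlier ones
    return table
```

A table like this only reflects the last address seen, so for investigating a past NTLM event it would need to keep timestamps as well; that is left out here to keep the sketch short.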

"Are any of your queries available?" Yes. Actually, the last couple of them are [inaudible]. There are a couple of [inaudible]: one small app that you just deploy on the domain controllers in order to get the logs, and another app that parses and [inaudible]. I just added [inaudible]. "Where did you say?" Thank you.

slash . And then there's other places you can find other things that are mentioned here. Any more questions?

All right, well, thank you very much.