
BSides Las Vegas 2019 D1 - Ground 1234!

BSides Las Vegas 2019 · Published 2019-08
Category: Technical
Style: Talk

Now it flickered. It's only one side? Yes. We can try with a PC to see if we can get a . Give me a quick second. I'm getting it back there. It's just not going there. So maybe if it's not the dongle problem, we can try my dongle, because I need that to keep the laptop forward. So let me try. I have to go back too.

Do you have any audio in your presentation? No. Okay. Do you want to get that going and make sure you're in presentation mode? Yeah, let's put it this way and make sure that this works. Perfect. Perfect, yeah. All right. Thank you. Okay. I think... Hi. Albert? Albert. Albert, everything's fine. All connected the way it's supposed to be. Yeah. Good. See? Appreciate you. And some folks might have had a challenge. Thank you for having all of your...

- Yeah, everything's working so far. We just got assigned, so. - Okay. - That's why I'm late, but it looks like-- - It's in sciatic nerve. - 10 minutes, you're going to have to-- - Sure. - I have this office in the back of the room. - It's 50 minutes, right? - Five zero. - Five zero. - So you're going to get your-- - So we'll try to be done at like-- - Should be 40 minutes. - Okay, nice. - We're starting in half an hour. - And the two minutes is the last one, right? Hopefully we don't get there. - It's definitely a problem. - You guys have any questions or anything? - No, I think we are ready. So maybe

we should move this. - To here? I don't want my laptop to get-- - Are you guys gonna try to have a question time at the end? - Yeah, so we target for a 45-minute talk, and we are now around 44, 45, so we should have like five minutes for Q&A. - Okay. - Okay, perfect. - All right, thank you. - Thank you. Maybe it's even better to hold it here, because if you get away a little bit, then it won't capture your voice. People will scream at you, "Get close to the mic!" Well, actually, I will start speaking. So, hello. I guess that they will just... Do you want to use a couple... Do you want to

use the color one? The color one. This one that gets here. The one that gets . That way you can-- because a lot of times you just turn your head. You explain something here and then it doesn't catch your voice. Yeah, I think we have one slide to present our names and roles. So if you just say Alexander and Alvaro. Alexander and Alvaro. I always mix one up because I can't roll my R's. I think Spanish is better.

Hello?

Hello, one, two, three, test. Perfect. Perfect. We are all set. All right, thank you. Thank you. This is another way we can go. So if they have these, I think it's better to figure out what they need. I think it's for helping people to understand. What would you say? Well, you are now... one year ago... We can always ask Alvaro to sum it up. I saw it in private, people, so they make it public.

Good morning and welcome to BSides Las Vegas, the Ground 1234! track. We have a few announcements before we start. We have "SSO Wars: The Token Menace" by Alexander and Alvaro. I was joking with him that I cannot roll my R's. My wife speaks Spanish and makes fun of me. Before we get started, though, we'd like to thank our sponsors, especially the Inner Circle sponsors, Critical Stack and Valimail, and our stellar sponsors, Amazon, BlackBerry, and the National Security Agency. It's their support, along with our other sponsors, that allows us to make this whole thing go on. Since these talks are being streamed to YouTube, we ask that you please silence your cell phones before we begin

and that you use a mic. If we have time for questions, we should have some at the end. Please raise your hand and I'll run the mic to you so that the people listening on YouTube will be able to hear it. If you have feedback about this talk or for the speakers, there's a link on the Sched entry for this talk. And with that, let's get started. Please welcome Alexander and Alvaro. Okay, can you hear me? More or less? Okay. So thank you all for attending our talk, "SSO Wars: The Token Menace." My name is Alvaro Muñoz. This is Oleksandr Mirosh. We are both security researchers with the Micro Focus Fortify team. He specializes in dynamic

and runtime analysis. I specialize in static analysis, so slightly different fields. So this is the agenda for today. We will start with a brief introduction to what authentication tokens are, especially in the context of delegated authentication. Then we will be presenting a couple of vulnerabilities that we found related to authentication tokens, both of them in .NET frameworks and .NET libraries. The first one is an injection vulnerability that leads to arbitrary constructor invocation. This doesn't seem very dangerous, but we will see the potential impact of this vulnerability. And then we will present an XML signature verification bypass that allows us to basically sign any SAML token and get the server to accept our SAML tokens as valid. So we will see how that affects

several of the Microsoft frameworks, such as WCF and WIF, Windows Identity Foundation, and some major Microsoft products such as Exchange and SharePoint. So, for this brief introduction, this is a diagram representing delegated authentication. It's not supposed to represent anything specific like WS-Federation or the SAML protocol; it's just a very generic diagram. Basically, we have three different entities. The first one is the user, who will try to access a protected resource that is served by the service provider. The service provider is actually not handling authentication itself; it's delegating authentication to a third-party entity, hence the "delegated authentication" name. So it will redirect the user to this identity provider, and the user will present their credentials to the identity provider. It will

verify the credentials and, if the credentials are valid, it will hand a token to the user, which the user needs to present at the service provider in order to access the resource. This is basically a summary of delegated authentication. These tokens that are returned by the identity provider can be in multiple formats, right? Like SAML, or JWT, JSON Web Tokens, or Simple Web Tokens, or really any other format. Most of them have some similarities, and they share these major, important attributes. These are the attributes that really matter to make a token functional. The first one is the issuer. That's who issued the token. It's not the same if I give you a coin that I issued myself as

if a bank gives you a coin that they issued, right? We also have the audience, which is who this token is for. Again, it's not the same: if we create a token for service A, this token is not valid for service B. And if service B accepts a token for service A, then there's a vulnerability and something that we can exploit. We also have the expiration date. Obviously, we cannot use a token after that date. And then we have a set of claims. The claims are basically a set of attributes that describe the user, and they will be used by the service provider to make authentication and authorization decisions. But the most important attribute here is the signature. If we don't sign the token,

anyone can change any of the claims, the issuer, the audience, anything in the token, and then get the service provider to accept that token as valid. And then the attacker will basically be able to become anyone they want to be on the service provider. So there are multiple steps that an attacker can think about attacking, but the most important one for us, or the one that we focused on, is step number six, when the user presents the authentication token to the service provider. The user, or the attacker, obviously. So the service provider needs to parse and process the token, and basically it needs to verify all the attributes and also verify the signature. If the signature is not valid, then the authentication token

should be discarded. So we will be focusing on this step, because we found two different interesting attack vectors here. The first one is: what if we can inject a malicious string into any of these attributes that gets parsed before the signature verification? Maybe that can lead us to some kind of injection before the signature itself is verified. The second vector is: well, maybe we can go one step further and actually bypass the whole signature verification process. If we can do that, it's like the holy grail, because we can basically modify any claims in the token, modify the user ID, the user email address, and just authenticate as any arbitrary user. So now we will see these two vulnerabilities that we found in .NET

Framework, and we will start with the token parsing vulnerability. In this case, JSON Web Tokens make a good illustration for token parsing vulnerabilities. JSON Web Token, or JWT, is an internet standard for creating JSON-based access tokens. We can see such a token on the screen. It contains three main parts: header, payload and signature. The alg field from the header defines what algorithm should be used for signature verification, so it is used before we know whether the signature is valid or not. .NET has a couple of libraries that can be used for JWT token parsing, and we found out that the System.IdentityModel.Tokens.Jwt library passes this alg field to the CryptoConfig.CreateFromName method. We noticed that this method does not restrict type

names, so we are able to invoke an arbitrary no-argument public constructor. By the way, this method can be reached not only from JWT tokens; a similar problem exists for SAML tokens as well. For example, the Algorithm attribute in the SignatureMethod element of a SAML token will go to this method without any restriction too. You may ask what we can do with this, because we are not controlling any data. But actually, we can control some data. First of all, it's the type name itself; we will show a bit later how we can use it. Also, please pay attention to the gadget on the screen. It is from the .NET Framework: in .NET, the current HTTP context is stored in a static property, and access to request parameters is done through it.
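To make this class of bug concrete, here is a rough Python analogue of an unrestricted type-name-to-constructor path. The function name, its logic, and the payload are illustrative assumptions, not the real .NET API: the point is only that resolving an attacker-controlled name and calling the resulting no-argument constructor runs constructor side effects of the attacker's choosing.

```python
import importlib

def create_from_name(type_name):
    """Resolve a dotted type name and invoke its public no-argument
    constructor -- a rough Python analogue of what an unrestricted
    CryptoConfig.CreateFromName allows in .NET."""
    module_name, _, class_name = type_name.rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls()  # the attacker-chosen constructor runs here

# An attacker-controlled "alg" value picks the type; even a harmless
# stdlib class shows the side effect: this creates a directory on disk.
handle = create_from_name("tempfile.TemporaryDirectory")
print(handle.name)
```

In the real vulnerability the gadget hunt is exactly this: finding types whose no-argument constructors do something useful for the attacker.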

So if a no-argument constructor uses this approach, it can be very interesting for an attacker. But these ideas about abusing no-argument constructors may sound not very realistic, so we decided to take two servers from Microsoft, SharePoint Server and Exchange Server, and try to explore this problem there. Here are our results. SharePoint returned us different results depending on whether the object was created or not. We can use this to get information about installed products and even their versions. Also, we were able to raise an unhandled exception, which leads to denial of service. Exchange Server gave us even more interesting results. As we already mentioned, we control the type name, and it may be not just a simple type name but an assembly-qualified name, including the assembly that we would like

to load this type from. .NET allows developers to control assembly loading by implementing their own custom assembly resolvers, and these may contain a lot of vulnerabilities. Also, these custom assembly resolvers are often installed by a static constructor, and usually it is not a big problem to invoke them by instantiating a specific type, which we can actually do with this vulnerability. But this attack is not simple. Sometimes, even if we are able to add our assembly resolver and this assembly resolver has some vulnerability, we may still not be able to perform the attack. For example, if the server has some assembly resolver registered before ours, and that resolver raises an exception for our malicious assembly name, our new assembly

resolver will not be called. Also, most of the gadgets that we found allow us to load libraries from the local disk, so an attacker still needs to find a way to upload malicious files to the server. Here we can see examples of such assembly resolver gadgets from Exchange Server. In the first snippet, the resolver is installed by a static constructor. In the second one, we can see the assembly resolver itself. It uses the assembly name as a path for assembly loading, with no validation at all, so we can use the dot-dot trick and change the current folder to any location we want. Let's put all this together. As an authenticated user, we can invoke an arbitrary public no-argument constructor from any

DLL file in any folder on the server. Of course, the attacker still needs to upload this file to the server, and for an unauthenticated user, that may be a very tough task. But it can be significantly easier if the attacker has some account on the target Exchange server. So for our demo, we assume that the attacker has already put this DLL file in the Windows temp folder. Let's look at how it works on a real Exchange server. First of all, here is our malicious library with our gadget. Here is our type name and two lines of code: we will take the cmd query parameter and use its value to start a new process. As we told you earlier, this DLL is already in the Windows

temp folder, so we can start crafting our request. As you can see, we are using a SAML token. Here is our vulnerable attribute: the type name, and the assembly name with the dot-dot trick that will change the current folder to the Windows temp folder. And finally, we would like to start the calculator. First of all, let's check in Process Explorer that there is no calculator yet. We send this request, and we can see the calculator in Process Explorer. Now let's switch to our holy grail and look at how the entire signature verification can be bypassed. Security Assertion Markup Language, or just SAML, is a critical component for many delegated authentication and single sign-on scenarios. It has an XML-based format and uses XML Signature

for integrity protection. By the way, this problem is relevant not only for SAML but for other protocols and applications that use the XML Signature standard, for example, signed SOAP messages or WS-Security. On this slide, we can see a simplified SAML token. Along with the authentication information, it has a Signature element that should protect it from tampering. This element contains three main parts: the signature itself, in SignatureValue; SignedInfo, with information about how this signature should be verified; and the most interesting element for our attack, KeyInfo. It represents the key that should be used for signature verification. SAML is quite an old protocol, and of course we are not the first ones who decided to look at its security. So we would like to highlight a couple of examples of interesting

findings from the past. For example, XML signature wrapping, discovered in 2012, allows an attacker to change the original token while the signature remains valid. The second example was presented last year by Kelby Ludwig; he used XML comments to change the meaning of some token attributes while the signature remained the same. Both examples require a valid token and do not change the signature. Our attack is different: we are able to calculate our own signature and create a SAML token with any information we would like. So we do not need to have a valid SAML token at all. To understand how this is possible, we should know how .NET implements signature verification. First of all, we need to obtain the key. Using KeyInfo, we can extract the key itself from it or use a key reference

to fetch the key from some storage. In the second step, using this key, we verify the signature value. But please note that after this step, if you have a positive result, it only means that this token was signed by this key and was not changed. In addition to this, we should be sure that it was signed by the proper signing party. So we take the KeyInfo element again, try to identify who has signed this token, and of course, in the last step, we check whether this expected signing party is trusted or not. At first glance it may look like a secure implementation, but please pay attention to these two steps. We are taking KeyInfo twice, and we need different types

of results in these steps: a security key in the first and a security token in the second. So we need to use different methods for them: ResolveSecurityKey and ResolveSecurityToken. On this slide, we can see their descriptions from the Microsoft documentation. Because the purposes of these methods are different, we can expect inconsistent results from them. And finally, we can see the general idea of our attack: we should craft KeyInfo in such a way that both methods will produce different results. One will be used for signature verification and the other for authentication of the signing party. In this case, we will be able to use our own key for the signature calculation, but the server will still identify us as the expected signing party. In general, such an

attack depends on... Do you hear me? So, in general, such an attack depends on the implementation of the mentioned methods, but in all cases that we checked, we were able to get these results. Some cases had additional requirements on the server or environment, as we will show a bit later; others were vulnerable by default. So let's see examples of differences between these methods that can be used for the attack. The first method can support some type of key clause that is not supported by the second one. Both methods can process elements in a different order, or they can even use different subsets of these elements. Now let's see how this attack looks in real-world applications and frameworks.
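The inconsistency described above can be sketched with a toy model. This is illustrative Python only, under assumed simplified semantics: a KeyInfo is a list of (clause_type, value) pairs, the clause names and the trusted-certificate set are made up, and neither function is the real .NET implementation. The key idea is that the two methods understand different subsets of clause types.

```python
# Toy model: each method understands a different subset of clause types.
TRUSTED_CERTS = {"idp-cert"}

def resolve_security_key(key_info):
    """Returns the key used to verify SignatureValue."""
    for clause_type, value in key_info:
        if clause_type in ("encrypted-key", "x509-cert"):
            return value

def resolve_security_token(key_info):
    """Returns the token used to identify the signing party; it does
    NOT understand 'encrypted-key' clauses and silently skips them."""
    for clause_type, value in key_info:
        if clause_type == "x509-cert":
            return value

# Attacker-crafted KeyInfo: their own key first, the trusted cert second.
key_info = [("encrypted-key", "attacker-key"), ("x509-cert", "idp-cert")]

verification_key = resolve_security_key(key_info)
signing_party = resolve_security_token(key_info)

assert verification_key == "attacker-key"   # signature checked with our key
assert signing_party in TRUSTED_CERTS       # trust check still passes
```

Both checks pass, yet the signature was verified with attacker-controlled material: exactly the mismatch the attack exploits.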

So, yeah, we will be reviewing some of the most important Microsoft frameworks that actually accept SAML tokens. The first one will be WCF; we will see that in a few seconds. It is basically used to build web services, like, for example, Exchange Server, and we will show you an example of how we can take over or compromise any account using this technique. The second one will be Windows Identity Foundation, which is basically used by any application that wants to integrate with an identity provider. So if you want to process authentication tokens and extract the claims, you don't want to do that on your own and reinvent the wheel; you will

be using Windows Identity Foundation to take care of that process. And last but not least, we will see how some applications like SharePoint actually customize the configuration of WIF and can make it even more vulnerable than in the default configuration. So the first framework is Windows Communication Foundation. As I said, it's used to build web services. It's a Microsoft framework for building service-oriented applications, which basically interoperate with other web services and clients developed in other languages and frameworks. It exchanges XML documents in the form of SOAP messages, but the most important part for us is that it will accept SAML tokens to authenticate the client. So if you are a client, you can present your

credentials with, say, a username and password, but you can also submit a SAML token in order to authenticate yourself. Depending on the version, it may or may not use Windows Identity Foundation to take care of handling the authentication token. We will start by focusing on the case when it's not using WIF, and then we will review the case when it is using WIF, because those involve different resolvers and a different attack. So the class that is responsible for handling the authentication token, in this case the SAML token, is this SamlAssertion class here, and it basically contains this piece of code. As we can see, we have the two methods that Alex mentioned. The first one is ResolveSecurityKey, which will

take the whole KeyInfo section. So this key identifier variable is the whole KeyInfo section. The other method is ResolveSecurityToken, which takes the same data, and this is user-controlled data. But they return different objects. The first one returns the verification key, that is, the one that will be used to verify the signature of the SAML token. The second one returns the signing token, which is used to verify that the token was signed by the identity provider, the identity provider that the service trusts, not just anyone. So for the first one, when resolving the security key, this is the code. It basically takes what we call a depth-first approach: it will iterate through all the different keys in the KeyInfo

section. There may be multiple; normally there's only one, but the standard allows many of them. So it will iterate through all of them, and for each of them, it will call this TryResolveSecurityKey down here. This method will basically try a number of different resolvers, at least three of them, in order to try to extract and resolve the key. The important thing here is that we first iterate through the keys and then through the resolvers. Now, if we check how the token is resolved, that's what we call breadth-first, and it takes a slightly different approach. It first iterates through the resolvers. And for each resolver,

it will pass the whole KeyInfo section and basically iterate through all the individual key elements. So basically it's like having two nested loops. In the first one, we iterate first over the keys and then over the resolvers; in the other one, first over the resolvers, and at the second level over the keys. So we will be iterating through the same elements, but in a different order, and by changing the order, we can actually abuse that in order to bypass the signature verification process. So now let's imagine we have a token, or we can even create one from scratch, and we will sign it with our own symmetric

key. We generate a symmetric key, or even an asymmetric key, but for this example let's stick with the symmetric key, and we'll sign our own token with our own key. It should not be trusted, right? Now we're going to send both the symmetric key and the expected, trusted certificate from the identity provider; we're going to send two elements in this KeyInfo section. Now, ResolveSecurityKey will iterate first through the keys and will take our malicious symmetric key. Then it will try all the resolvers for this key. The first two resolvers will fail, but the third one will succeed, and it will return our symmetric key, which will be used to verify the signature. Since we control it, we can bypass it.
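The two loop orders described above can be sketched minimally, assuming a simplified model where each clause is a (kind, value) pair and each resolver handles a fixed set of clause kinds. All names here are hypothetical stand-ins for the WCF internals, not the actual .NET code.

```python
# Hypothetical resolvers: each handles a fixed set of clause kinds.
RESOLVERS = ["issuer-resolver", "cert-store-resolver", "intrinsic-resolver"]

def can_resolve(resolver, clause):
    kind, _ = clause
    if resolver == "issuer-resolver":
        return False                    # resolves neither clause in this example
    if resolver == "cert-store-resolver":
        return kind == "x509-cert"      # only known, trusted certificates
    return True                         # intrinsic resolver: raw keys too

def resolve_security_key(key_info):
    # Depth-first: outer loop over key clauses, inner loop over resolvers.
    for clause in key_info:
        for resolver in RESOLVERS:
            if can_resolve(resolver, clause):
                return clause[1]

def resolve_security_token(key_info):
    # Breadth-first: outer loop over resolvers, inner loop over key clauses.
    for resolver in RESOLVERS:
        for clause in key_info:
            if can_resolve(resolver, clause):
                return clause[1]

# Attacker's symmetric key first, the trusted IdP certificate second.
key_info = [("symmetric-key", "attacker-key"), ("x509-cert", "idp-cert")]

print(resolve_security_key(key_info))    # verification uses the attacker's key
print(resolve_security_token(key_info))  # trust check sees the IdP certificate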

Now, ResolveSecurityToken down here will basically take a different approach and will first iterate through the resolvers. So it will take the first resolver, and with that resolver, it will try to resolve the two key elements here. The first one will fail, because it will not trust a symmetric key that it doesn't know, but the second one will succeed, because it's actually the certificate from the identity provider. So it will return that certificate, and that certificate will be used to authenticate the signing party, that is, the identity provider. So now we are able to bypass the signature and basically craft arbitrary tokens. So this is how it looks

in the XML token. We have this first element, which is in this case a symmetric key, but it could be an asymmetric one. That's the injected key, the one that we will use to re-sign the document and the one that the server will use to verify the signature. And the second one, in green, is actually the original trusted certificate from the identity provider, and this is the one that the server will use to verify the trust in the identity provider. So let's see how we can abuse that, for example, to take over an account in Exchange Web Services. We thought about doing this demo using a real client such as Outlook

or Lync, but the demo was kind of complex and long, so we developed our own client for Exchange Web Services. You can even craft your token without intercepting any real token; you can just craft it and sign it, and it's still valid. But for this demo, we prefer to intercept a real token and modify it. So this is Burp. Basically, we will be using Burp to intercept this client request. And this is the client that we developed. It's very simple: we are just sending a request for user one, in the name of user one, and we are basically requesting the MailTips. We could have asked for the mail items themselves, but this is

simpler. So we send that, and we intercept the request. If we send that to Repeater in order to modify it more easily, we can switch to the XML view, and we will see the whole SAML assertion first in the SOAP header, and then the body of the SOAP request. So we will be replacing all the instances of user one with administrator. If we don't re-sign this, if we just intercept the request and modify it, we will break the signature, and it will not be valid, right? So if we send the request like this, we will get an internal server error because it's not able to verify the signature; we just broke it. However, we developed

this plugin for Burp, which we will be releasing, and it will allow us to re-sign it with our own keys. So in the original assertion, you can see that there is only one element in the KeyInfo, which is the expected identity provider certificate. Now, if we click on "Resign with RSA key", it will basically create a new RSA key pair and then use the private key to re-sign the whole token. Then, in the KeyInfo section, we will find the one from the identity provider but also our own RSA key. Remember, this one will be used for verifying the signature. So now, if we send that, we get a 200 response, which

is successful, and we can see that we are able to get, in this case, the MailTips for the administrator. We get no errors, and the response was successful. So with that, we are able to impersonate basically any user in Exchange Web Services. - You mean, like, transport protocol? Transport security? - It doesn't matter, because you're crafting your token from the client. In this case, we are intercepting it, but if you don't want to intercept it, you can craft the same token yourself and just sign it. So interception is just one version of the attack. Also, if you control the client, you can use Burp to intercept the traffic for that client, even if it's using SSL. Okay, so Windows

Identity Foundation. It is another example of a framework. Windows Identity Foundation, or just WIF, is a software framework for building identity-aware applications. It makes it very easy to add delegated authentication to your application and to support authentication tokens from different security token services like Active Directory Federation Services, Azure Active Directory, Windows Azure Access Control Service, and others. For parsing SAML tokens, WIF uses a SAML security token handler, and it takes a slightly different approach to the KeyInfo element than the one we just saw in the previous section. For the security key, to verify the signature value, it will take only the first element of the KeyInfo. But for the security token, for authentication of the signing party, it will work with

all the key identifiers from the KeyInfo. Also, WIF in its default configuration uses an issuer token resolver. This resolver seems to be secure, because it has very similar code for both of the mentioned methods. But if this resolver cannot resolve some key identifier, the identifier will be passed to the next resolver, the X.509 certificate store token resolver. And here we have a difference in the mentioned methods that can be abused: ResolveSecurityKey can work with an EncryptedKeyIdentifierClause, but ResolveSecurityToken doesn't support it. This difference can definitely be exploited, but here we have a couple of problems. To decrypt our symmetric key, the server needs to have a certificate with a private key in a specific certificate store; by default, it's the local machine's Trusted People store. Also, the attacker would need the public key from this certificate

to encrypt this key, and it is just a public key, so in many cases that is not a big problem. If these requirements are met, we can perform our attack. We use a symmetric key for the signature calculation; after that, we encrypt this key using the public key from the server certificate and put the EncryptedKeyIdentifierClause in the first place. After that, we put the expected certificate. So ResolveSecurityKey will return our symmetric key, and we will pass the signature value verification. But ResolveSecurityToken doesn't work with the EncryptedKeyIdentifierClause, so it just skips it and takes the second element from the KeyInfo, which represents the expected signing party. So we will pass the authentication of the

signing party as well. On this slide, we can see an example KeyInfo for this attack: the encrypted symmetric key in CipherData; the internal KeyInfo, in the yellow section, representing the certificate whose public key was used for encryption; and, similar to the previous case, the expected certificate in the green section. We were reviewing the default configuration of WIF, but it allows a lot of customization, so let's review an example of customized WIF. - So if you are using WIF to build your application, you can customize it, and one of the major Microsoft products that customizes WIF is SharePoint Server, so it's a very interesting target for us. We analyzed how SharePoint customizes WIF for its own purposes. It basically uses the standard or default configuration, but it uses a

custom issuer token resolver, the SP issuer token resolver. And remember, this is using WIF, so for security keys we only process the first element in the KeyInfo section, and for security token resolution we use all of the elements in the KeyInfo section. Apart from that, the key resolution supports symmetric and asymmetric keys; internally, these are called intrinsic keys. But the token resolution doesn't support them. So we can abuse that in a very similar way to what we did with Exchange, by re-signing our token, or crafting a token from scratch and then signing it, using our own private RSA key, and then sending the public key of this RSA key pair along with the original, trusted certificate to the server.
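This first-element-versus-all-elements mismatch can be sketched as a toy model too. Again this is illustrative Python under stated assumptions: the clause kinds, the trusted set, and both function bodies are made up to mirror the behavior just described, not SharePoint's actual code.

```python
# Hypothetical simplification of the SharePoint/WIF resolution rules.
TRUSTED_CERTS = {"idp-cert"}

def resolve_security_key(key_info):
    # Verification key comes from the FIRST clause only; "intrinsic"
    # symmetric/asymmetric keys are accepted here.
    kind, value = key_info[0]
    if kind in ("rsa-public-key", "symmetric-key", "x509-cert"):
        return value

def resolve_security_token(key_info):
    # Signing-party identification walks ALL clauses but skips the
    # intrinsic keys, so it lands on the trusted certificate.
    for kind, value in key_info:
        if kind == "x509-cert" and value in TRUSTED_CERTS:
            return value

# Attacker's raw RSA public key first, trusted IdP certificate second.
key_info = [("rsa-public-key", "attacker-rsa"), ("x509-cert", "idp-cert")]

assert resolve_security_key(key_info) == "attacker-rsa"
assert resolve_security_token(key_info) == "idp-cert"
```

Putting the attacker's intrinsic key in the first position is enough: key resolution never looks past it, while token resolution skips it entirely.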

Again, the server will take one for signature verification and the other for authentication of the signing party, and we will be able to abuse that in order to make the server use our own RSA key for verifying the signature. So this is how it looks in XML. It's very similar to what we saw in the Exchange demo. It's using an RSA key: as you can see here, we are sending the modulus and the exponent of the public key. This is what an intrinsic key is, something that can be recreated out of this data. And then, obviously, the original, trusted certificate for the signing party. So we thought, "Yay, now we can abuse and attack SharePoint." But we found

another problem. This is how the authentication flow works for SharePoint, more or less. If there is any Microsoft engineer in the room, please correct us if we are wrong. Basically, the user requests an authentication token from the IdP, the identity provider. It may be ADFS or Azure Active Directory or any identity provider. And with the authentication token, the user presents it to SharePoint. SharePoint will validate it using the token resolver that I just showed you, the SP issuer token resolver, which is vulnerable, and we can abuse that to make it trust our signature and then bypass the signature verification. Okay, so far so good. But now, SharePoint doesn't keep our SAML token. It will exchange that token

with the internal SharePoint STS, that is, the security token service, in order to get an internal, local session token. And this other service, which is implemented as a WCF web service using WIF, uses the WIF default resolver; as Alex explained, in order to attack that configuration, we need to get access to the public certificate that is stored on the SharePoint server. This is somewhat of a strong limitation, and we didn't want to go this way because it's not that fun. So we found a different flow. Basically, it's not a flow, but a different path that we can abuse to exploit it. In step number six, you can see that the session token is cached, right? And then, out of this session

token, it generates a session cookie that is returned to the user. Now, in order to attack SharePoint using this technique, what we can do is authenticate as a valid user, as the attacker, maybe with a low-privilege account, and then get our session cookie. In this process, the session token for the attacker will be cached, and the cache key will be an internal identifier for the attacker, right? Just remember that. Now, what we will be doing is basically crafting a malicious token from scratch that will have some very specific claims. So first of all, the issuer will be SharePoint; it will not be the identity provider. And then we will use the victim's data, the victim's attributes, for example the user principal name,

user ID. So the token will be representing the victim. But we will send the attacker's internal ID as the cache key. So now SharePoint will receive that, it will verify the signature, which we will bypass, and then, because the issuer is SharePoint, it will not try to exchange the token with the STS, the Security Token Service, and will basically create a session token out of this data. So we are basically creating a session token for the victim, but when storing it in the cache, it will use the attacker's cache key, and therefore we will be poisoning the cache and basically replacing the session token for the attacker with the victim's one. So now all that the attacker needs to do is refresh the browser,

browse again to SharePoint, and then he will get authenticated as the victim. Now, if you do that for an administrator account, that's basically remote code execution. So, last demo. Here we are: the attacker is browsing to SharePoint and is going to authenticate with valid credentials. As I said, maybe low-privilege, and maybe he wants to escalate privileges to get remote code execution. Now, we will just get authenticated as the attacker, as you can see here. So far, nothing special, no attacks, just basically logging in as a user in SharePoint. Now we can go to this endpoint, which is an API endpoint to get information about the users in SharePoint. It's public and you can

access it if you have any account. That returns a piece of XML that — I'm sorry for the font size, but I will zoom in very quickly. So basically, if we search for the attacker, we want to get the attacker's internal ID. You can see that the attacker's internal ID is this long string with some information about the email address and the identity provider and so on. And now, if we search this document for the victim, we will get the victim's internal ID. So this is the information we need to craft our malicious token. And then, because we can re-sign it using our key confusion attack, it will be accepted. So this is

the code to craft this malicious token. We will be using the victim's internal ID, but for the application token cache key, we'll be using the attacker's one. And this is what will allow us to replace and poison the attacker's session token. So this is the token that it generates. As you can see here in the KeyInfo section, we have two key identifiers. The first one is the RSA key value; that is the one that we use to re-sign it, the one that we control. And then there is the key identifier that SharePoint trusts. Now, if we scroll up a little bit to the SAML claims, or attributes, we can see that we have the user ID,

and everything points to the victim, right? The user ID, also the user email address, the UPN. But the application token cache key, which is a special claim, will point to the attacker's internal ID. So now, if we send this request, SharePoint will basically replace the attacker's session token in the cache with the victim's one. And now we just have to go to the browser, refresh the session, and then we will get authenticated as the victim. And remember, if you do that for an administrator, that will lead to remote code execution. So... oh yeah, we are releasing this Burp plugin. It's not ready yet; it will probably be released tomorrow after our Black Hat talk. It will be released in this repo, that's

for sure. And it will basically allow you to intercept some of the assertions, some of the tokens, and just modify them and use either an RSA key to re-sign them, or, if you get access to the public certificate in the case of a WIF application, then you can import it down here and then re-sign. There will be instructions and more information in the repo, probably tomorrow. So with that, a few conclusions. Basically, we are not saying that the SAML or WS-Federation protocols are insecure; it's that some implementations have flaws. In this case, the flaw in the .NET implementation is that they are processing the same user-controlled data with different methods, and they are processing this data slightly differently,

right? And we can abuse that in order to make one of the methods process one key and the other one process a different key. Also, we focused on researching the .NET libraries and framework because it was basically not vulnerable to any of the other attacks that Alex presented, and we didn't review any other languages or frameworks. We don't expect this very same flaw to be present in other languages, because it's somehow specific to .NET and how it handles keys and tokens in a different way, but there may be similar flaws out there in other languages and libraries. Also, even in .NET — remember, this is a vulnerability that affects SAML token verification, but the vulnerability is in the way that XML signature

is processed. And basically, XML signature is used in a lot of other places. We found other cases that we reported to Microsoft, and they patched them, but there may be other cases where XML signature is still insecure. Main takeaway: patch as soon as possible if you have any SharePoint or Exchange servers on-premises, because obviously the cloud ones are already patched. And with that, if you have any questions, I think we still have some minutes, or you can reach us on Twitter. These are our Twitter handles. So thank you very much. - We have a little more than five minutes, and let me remind you, please raise your hand and I'll run the mic to you to ask questions, so people watching online are able to see it or

hear it, I mean.
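To make the cache-poisoning trick from the talk concrete, here is a rough sketch of such a token. The claim names, the cache-key format, and the XML shape below are illustrative placeholders, not SharePoint's actual schema; only the structure of the attack (victim identity claims, attacker cache key, SharePoint as issuer) comes from the talk.

```python
# Illustrative sketch of the cache-poisoning token described in the talk.
# Element and claim names here are placeholders, not SharePoint's real schema.
import xml.etree.ElementTree as ET

def craft_poisoned_token(victim_upn, attacker_cache_key):
    """Build a token whose identity claims name the victim but whose
    cache-key claim names the attacker, so the server stores the victim's
    session token under the attacker's cache slot."""
    # Issuer is SharePoint, not the IdP, so the server skips the STS exchange
    token = ET.Element("Assertion", Issuer="SharePoint")
    attrs = ET.SubElement(token, "AttributeStatement")
    for name, value in [
        ("upn", victim_upn),                    # victim identity claims
        ("userid", victim_upn),
        ("tokencachekey", attacker_cache_key),  # attacker's cache slot
    ]:
        a = ET.SubElement(attrs, "Attribute", Name=name)
        ET.SubElement(a, "AttributeValue").text = value
    return ET.tostring(token, encoding="unicode")

token_xml = craft_poisoned_token("admin@corp.local", "i:0#.w|corp\\attacker")
```

Once this token is re-signed via the key confusion and accepted, refreshing the attacker's session returns the victim's identity, exactly as in the demo.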

Hello. That's just fascinating — really excellent work. I'm wondering whether you see the problem in the SAML specifications. Are they vague about how multiple key information is supposed to be processed? Or are the standards clear, and the implementations are where the problem is? - Yeah, so normally you will always find that the KeyInfo section contains only one key. That's like 99% of the cases. But actually the standard allows multiple keys, and it doesn't define how they should be treated. And probably one of the use cases is key rollover: you may send the actual one but also send the next one, to be stored by the client or something like that. But I don't think that is documented. It's a legal

case, sending multiple keys — actually, Microsoft supports them — but because it's not very common, I think it's not handled or not tested very thoroughly. That's our assumption. Thank you. - Hi there, what version and build of Exchange was that demo done on? - Okay, so it was — do you want to answer this one? - Yes, it's Exchange Server 2019. It was before last Tuesday's patch, because last Tuesday's patch solved this problem. I cannot tell you the exact build number, but it's quite fresh. It was done in the spring, this spring.

There's another question. By the way, the problem is not in Exchange Server — we just want to highlight that again. The problem is in the .NET Framework libraries. So we do not see any strong dependency on the build of Exchange Server. - What are the best practices to be able to shut down this type of attack? I mean, apparently, if you can't get the public key for that particular machine — as far as using a different public key for the internal trusted store versus what the web server presents — what's the best way to shut it down, other than patching? - To be honest, we were shocked to see that this KeyInfo is processed twice. Why do we need to do this twice?

The secure way is to just take it one time — take what you need, the token, the key and other stuff — or at least verify that they are connected somehow. Here there is no verification that they are connected, that the key is from the token or the token from the key. And we believe that a proper implementation should choose this way, or check the connection between the results, or use only one method, one approach, for both results. - So other than patching, the other thing that you can do, as you said, is to not expose the public certificate — those certificates stored in that particular certificate store. But that's only for Windows Identity Foundation. For WCF, for SharePoint, that doesn't matter. So the only way of really patching the issue is by

applying the latest security patches. Also for SharePoint, for example, you need to upgrade to the latest version of SharePoint, but also get the latest version of the identity model library through NuGet. You'd better go to the Microsoft page for this CVE because, for example, in SharePoint, it's not straightforward — it's not just as simple as upgrading SharePoint to the latest version. - I'm just curious. It sounds to me like this is all on-prem. Is this a problem on the O365 platform? - It used to be. - It used to be? - Until the Tuesday patch, the July Tuesday patch. So the good thing is that, because it's Microsoft infrastructure, they already patched that. So the online version, the cloud version, is patched. Now what remains is the on-premise leftovers.

If your company is using on-premise services, those need to be patched. - Along with that, though, was it defeated in O365 by MFA or not? - Say that again, sorry? - Was that actually preventable by MFA, multi-factor authentication, or not? - That's a good question. So if the token requires a second factor, then yeah, it will be prevented by that, because the SAML token will be just one of the factors. - Good question, thank you. - We have time for one last question if somebody wants it. - Has this been exploited in the wild at all? If it's been around for several months — I mean, especially if it's on O365... - Not that we know of. I mean, we don't

have access to Microsoft monitoring systems, for example for O365. - So there are no reports of it being exploited, not even on any on-prem solutions. Are there any solutions that detect this sort of attack, now that it's been implemented? - Guys, this information was not publicly available. Just the two of us knew about this, and we reported it to Microsoft. Microsoft knew about this for a couple of months, and we can understand Microsoft, because if you look at the description for the CVE, there is a huge list of products that should be patched, including the .NET Framework and other stuff. So this information was not publicly available. This is the first time we are presenting it. We were not exploiting anything, and we hope nobody else was. - We promise. - Yeah, so it was known to Microsoft for a couple of months,

but it was known just to Microsoft and us. - Is there a way to find out if someone has been exploited with this at all? Would there be any sort of log or any kind of reference in any auditing on any on-prem system? - We have no idea. I mean, we can't be 100% sure that, I don't know, someone in China or in Russia knew about this vulnerability and was exploiting it. As soon as we found it, we reported it to Microsoft, and as far as I know, in the CVE report they say that it's not exploited in the wild. So I don't think that they have evidence that it's been exploited in the wild. -

Okay, all right, thank you. - So this is the CVE number for the key confusion. You will find a lot of information in the Microsoft bulletin about how to patch it and whether it's exploited in the wild or not. - Thank you all for coming out. Please thank Alexander and Alvaro, and support our sponsors. I can't remember — the platinum sponsors, Critical Stack and Valimail. - Yes, it's the repo for the plugin, but on the Black Hat site they always make available all the talks, and we sent them the white paper and slides. So the slides you already saw; I would recommend downloading the white paper. It contains all the code snippets and explanations. - We didn't have permission to attack Microsoft, permission

to attack the cloud version. But when we reported it — - They invest a lot of resources there right now. - We let them know that, for example, in WIF, the impact is bigger in the cloud version than on-premises. There's no restriction for the . No, no. - No, no, no, no. Originally, usually, this token contains the X.509 certificate, or some number referencing it somehow, for the cloud version. The injected RSA key, it was like a . So we can generate an RSA key pair, private and public. We can use the private one for the signature and put the public one in the RSAKeyValue clause. The server needs just the public key for verification. - Yes, understood, understood, understood. In

the explanation for this demo, we were sure that it's symmetric. The idea was to show the main attacks, whether it is a symmetric key or an X.509-certificate kind of thing. - We have these examples in our talk, but when we started to record the demo, this plugin didn't yet support a symmetric key. But we would like to show the demos with the plugin as well. So we just put an RSA key; with a binary key it's still possible, but we would not be able to use the plugin in this case. So it is the same, yes. Yeah, the same — and actually our POC for Microsoft uses a binary secret, actually it's symmetric, without any encryption. But the plugin doesn't support it yet. - I can implement it. I

didn't think it was so easy, because I tried to do this for .NET and there were a couple of problems. - It's all very simple. For the attacker, usually he doesn't care about the second one. He just takes — or no, it can be an X.509 certificate, it can be an RSA key, it can be whatever. - Yes, the same type, yes. Yes, I agree with you. But in the demo, there should be an RSA key and an X.509 certificate.

- We didn't have a chance to look deeper at the implementation. We tried our test cases and they didn't work, so it seems that they are fixed. We were busy with preparation and did not make any deeper analysis. I hope it's fine, because they took months for the creation and we see the long list of these products, so we hope this should be all okay. We have some assumptions, because of how the developers of .NET try to make abstractions. So on the second step, when they try to understand if it's good or not good, they work at the level of, say, this is some type of authentication. So for authentication, they use security tokens. So they say, okay,

let's take security tokens, and theoretically it can be any of them. And for the signature they need keys. And because they need these two different results, they use different methods — but guys, those can produce different results, even though it's the same data for both. Sometimes it depends on the implementation, sometimes it depends on the configuration, but as we saw, it can happen by default. And it should be sequential: get the security key, and then, when resolving the security token, you should take what the security-key resolution returned and resolve the security token from that key. As I said, I mean, you could really see the two things as

the same. If you already implemented the code to extract the keys, you can reuse that and call it in the first place. You can do this just once. - You can have two different results from one resolution. There are many options. I do not think that's what they implemented; we have to check the fix. - They may implement some additional verification on the last step, that they are connected. As a patch, it looks less risky, because we do not know what has to be supported — customers may want to support a couple of types and a couple of things. So my guess is it may be done by additional verification that they are connected. I do not know. Maybe they took another course. I do not

know. We will look at it, but still — we didn't want to look at the fix before the talk, because if the fix was not good enough, then we would not be able to present it. So it was like, okay, better not take a look. No, we did not have time. We didn't have time, and we were not asking Microsoft to do this. So we trust Microsoft's QA department. They say, OK, we have all the information that we need, we will do it ourselves. OK, guys. No problem. Do you have your mic? Yeah, I already took the other one. OK. My boss texted me during the talk. He said, can you call me ASAP? I said, can it wait like 40 minutes?

Nice. - It was interesting, as I said.

I heard you close the door behind you. Do you have the same computer audio? Yep. Try to plug that in. I can. I'm happy to. Yeah, that one fits a little better. That's hot, too. But no audio. There you go. But your eighth is completely melted. So this is my fourth day at this property? Yeah. It's my fourth day at this property, too. But that means my next year is here. Is this still on? Is this on? Yeah, I think this is on. Yeah. Yeah, near Yocha. Yeah, that's near Yocha. Yeah. Do you have the bio here or no? No, this is just my instructions, like a script for thanking the sponsors and all that kind of shit. You can say in your

show we'll talk. If you want me to read the back. Yeah, yeah, yeah. Appreciate it. I'll be in the back of the room. We can give you a 10-minute warning and a five-minute warning. Can you also take photos? Yeah. Okay, so... Do I need a passcode to put the camera on your phone? No, the only thing you need to do is you need to put it like that. Okay, that's Apple. Yeah, and then just get the camera and then you can take it from the back. Perfect. Thank you so much. What's your name? Brandon. Brandon? All right, nice meeting you. What is Brandon? So how long is the... Is it 50 minutes? Actually, 55 is what it

is. So the 10-minute warning you give me is before the end of the 55. Yeah, so 45 in. OK. Don't worry about it. Thank you so much. Yeah. What is this? It's just like the Wi-Fi thing. OK. Then I guess I use this. I think you should leave it here. It has to be a quick release. It doesn't help on the screen. We're going to record your session, in case you didn't know that. We'll have the camera on you at all times, and we need you to use the microphone at all times. So welcome to BSides Las Vegas, the Ground 1234 track. Today we have Nir Yosha, giving his talk on his quest for privileged identity to own his own domain:

privilege escalation and lateral movement. Nir started his career as a squad leader in the Israeli Intelligence Corps, where he helped with gathering intelligence, tracking the growth of terrorist organizations. He has over 15 years of experience in identity management, user behavior, and insider threat analysis. Currently, Nir is a principal solutions engineer for Preempt. Nir publishes posts on LinkedIn and speaks occasionally at security conferences, and we're lucky enough to have him here at BSides. Before we get started, though, I do need to thank all of our sponsors, especially our stellar sponsors, Critical Stack and Valimail, as well as our other sponsors, Microsoft, Cylance, and Robinhood. These talks are being recorded and streamed to YouTube, so if you can please silence your cell phones, we would appreciate it, and so

that the people who are watching online are able to hear. If we have time for any questions at the end, you can raise your hand and I'll run the mic to you to ask your questions. And with that, let's welcome Nir to BSides. - Thank you, thank you so much. How is everyone doing? Good afternoon. I'm not going to drop any 0-days here. I'm not going to reveal any new sophisticated attack vectors. What I will talk about are those good old Active Directory misconfigurations. Do we have any defense guys here? Anyone specifically who deals with Active Directory vulnerabilities? Yeah, so you know what I'm talking about. I work for a company that has an assessment tool for Active Directory. So what

we do is we look at the identities and try to figure out those vulnerabilities, and it is still relevant. In 2019, many, many years after the patches came out, we still see those vulnerabilities with our customers. There is a lack of visibility into users' access. There are specific threat detections that are still missing when we look at pass-the-hash, golden ticket — we'll talk about these today. And there's no enforcement; a lot of the stuff is post-mortem, after the intrusion. So, real story: around three months ago, we were getting a phone call from a client, and he needs our assessment tool because they have this unexplained lockout of accounts. We went to the site, and we see an unmanaged device, a new device in the

network that is trying to log in to the domain. And we see that there is a huge number of users, some of which are not even part of the domain, and a lot of them are getting locked out. We're showing it to the client, and then we see one user that successfully logged in to that unmanaged device. So we're showing it to the customer, and he's not impressed. He cares about the locked-out accounts — he wants the locked-out accounts unlocked — and we're trying to explain to him that there might be a relationship between the locked-out accounts and that authenticated user. And before we end this discussion, we see that user getting into the

CEO's laptop, and magically the customer started to freak out, disconnected the device and disabled the account. But the story is exactly what I'm going to talk about here today: those misconfigurations in Active Directory. So I tried to structure the talk in a sequential way, the way we usually see it. You can call it the kill chain of owning your domain. I'm not going to focus on the initial intrusion or the execution on objects; I'm going to look at a very narrow phase where there's privilege escalation, lateral movement, internal reconnaissance, and as you can see here, it's a cycle. It's a cycle that keeps on happening until the bad guys get to their targets. So I

have to embarrass myself. This is actually from my first BSides, at Cleveland. And yes, I dressed up for my talk as a detective. That's how weird I am. The idea in my crazy mind was that, because I'm talking about threat intelligence and indicators of compromise and investigation, a detective would make sense. So that's what my background is: threat intelligence. I used to be in the Israeli Intelligence Corps. And I don't know how, but I guess it worked, because I'm now in Las Vegas, right? So someone liked it. Okay, so we are going to skip the initial intrusions. The fact of the matter is that there are many ways to get inside the network. Mainly the portals, the open portals: VPN, OWA — which I still don't

understand why it's still not protected by MFA in most of the companies I'm working with — and Citrix; there are talks all over about escaping Citrix boxes. So our assumption is that we have already established a foothold within the network. And at this point, whether it's malware injected into one of the processes, or hijacking some high-privilege program, or bypassing the user privilege — sorry, the UAC — we're getting a local administrator on an endpoint that is probably not our target, not our end goal. And so, from a target perspective, at this point I want to start to look around. Just as you do with external reconnaissance, you want to start to monitor the new environment, live off the land, so to speak, and

find your next hops. Well, Microsoft is making it easy. Is anyone familiar with GPP, Group Policy Preferences, which allows us to set up passwords across the domain? That was introduced in Windows Server 2008, and we still see customers today using it. Or even if they upgraded their systems with a later patch, they still have those passwords there — they don't change them. Microsoft eventually realized that it is a bad idea to send the same administrator password across the domain. They did it in phases. The first realization was: we probably don't want to keep passwords in clear text within the domain controller. Probably a bad idea. So administrators were able to create VBScripts with clear text, and then Microsoft added

a patch that would encrypt that with an AES encryption key. But around 2012, Microsoft dropped the ball and published the private key for that encryption on MSDN, a public portal. So everyone could go and decrypt that XML sitting on SYSVOL, which, by the way, is accessible to any domain user.
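The SYSVOL half of this can be sketched in a few lines: pull the `cpassword` attribute out of a Groups.xml and restore the base64 padding that GPP strips. The sample XML and ciphertext below are made up, and the AES key constant is my transcription of the key Microsoft published — verify it against the MSDN page or a tool like gpp-decrypt before relying on it; the actual AES-256-CBC decryption is left to such tools.

```python
# Sketch: extract GPP "cpassword" values from a Groups.xml found on SYSVOL.
# Decrypting them is then AES-256-CBC with the key Microsoft published on
# MSDN (gpp-decrypt / Get-GPPPassword do this step).
import base64
import xml.etree.ElementTree as ET

# 32-byte AES key from Microsoft's own documentation (assumption: transcribed
# correctly here; verify against the MSDN page before use).
GPP_AES_KEY = bytes.fromhex(
    "4e9906e8fcb66cc9faf49310620ffee8f496e806cc057990209b09a433b66c1b"
)

def extract_cpasswords(groups_xml: str):
    """Return (userName, raw ciphertext bytes) pairs from a Groups.xml blob."""
    out = []
    for props in ET.fromstring(groups_xml).iter("Properties"):
        cpass = props.get("cpassword")
        if cpass:
            # GPP strips base64 padding; restore it before decoding
            cpass += "=" * ((4 - len(cpass) % 4) % 4)
            out.append((props.get("userName"), base64.b64decode(cpass)))
    return out

# Invented sample for illustration; real files live under
# \\domain\SYSVOL\domain\Policies\{GUID}\...\Groups.xml
sample = """<Groups><User name="LocalAdmin">
  <Properties userName="Administrator" cpassword="AAECAwQFBgcICQoLDA0ODw"/>
</User></Groups>"""
found = extract_cpasswords(sample)
```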

So what we see across the board is that, even though most of the clients have already applied the security patch that solves this problem, a lot of them forget about the files that were sitting in the SYSVOL folder. And what we can do is get those files, decrypt them with the published private key, and most of the time the password is still relevant, still being used within the network. So, mitigation: don't forget to remove all those exposed passwords after patching. Microsoft came up with LAPS — anyone familiar with LAPS? This is a solution that randomizes the password across the domain — sorry, across local administrators. There are other solutions to do that.

And in many cases, what you can do is just disable the user. Not every endpoint requires the local administrator — definitely not the guest user — to be there. The other thing we see is that, obviously, a lot of endpoints today are being provisioned from an image file. And so the image has a specific local admin password, and if we detect that the boot timestamp is similar between two machines, there's a chance that the local administrator password is the same as well. Does that make sense? Now, if we're not lucky enough to get those local administrator passwords working for us, we will start with internal reconnaissance. So, coming from threat intelligence: it's very similar for the bad guys when they get into

the network to look for intelligence, gathering it the same way they do when they are outside the network. There are scanning tools, you can leverage Active Directory, and you can look for other evidence within the network. So, we all know the scanning tools. One port that uniquely identifies a domain controller is the LDAP Global Catalog, on port 3268. Does anyone remember which port plain LDAP is? 389. That one is not necessarily going to be Active Directory — it could be any other directory. But this port will help you find the domain controllers in your network. The other thing I show here is: which one do you think is the more under-the-radar set of flags for Nmap, the top or the bottom one?

The top, because scanning tools that basically kill the session immediately after starting it are very distinctive within the environment. So as an attacker, one thing to look for is staying under the radar while scanning. But there's an easier way: once we identify the domain controller, we can query it using valid LDAP queries. So this is a tool called BloodHound, which is much less noisy than Nmap or any other scanning tool, because it doesn't interact with the hosts; it just interacts with the domain controller, and it looks for all those entities within the domain — the users, the hosts, the groups — and builds this graph that gives you an idea of how the land looks and where

should you move from there.
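The query BloodHound answers over that graph is essentially shortest-path: the fewest relationship hops from an owned principal to a target. A toy sketch, with edge types loosely modeled on BloodHound's (MemberOf, AdminTo, HasSession) and an invented edge set:

```python
# Toy version of BloodHound's core query: shortest chain of relationships
# from an owned principal to a target. Edges below are invented examples.
from collections import deque

def shortest_path(edges, start, goal):
    """BFS over (src, relation, dst) triples; returns the hop list or None."""
    adj = {}
    for src, rel, dst in edges:
        adj.setdefault(src, []).append((rel, dst))
    queue, seen = deque([(start, [])]), {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None

edges = [
    ("alice", "MemberOf", "Helpdesk"),
    ("Helpdesk", "AdminTo", "DEV-SRV01"),       # group is local admin on host
    ("DEV-SRV01", "HasSession", "da_bob"),      # a domain admin has a session
    ("da_bob", "MemberOf", "Domain Admins"),
]
path = shortest_path(edges, "alice", "Domain Admins")
```

The real tool does this over LDAP-collected data with a graph database, but the attacker's question — "which hop gets me from here to Domain Admins?" — is exactly this.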

So this is an example of a BloodHound demo where there is a target. The target is a development server, and at the top is my source. And BloodHound finds for me one hop where I can, in this case, elevate privileges from a domain user to an administrator user that is part of the domain admins on the target. So that tool is becoming very handy for finding my targets. And then there is the typical intelligence gathering. We still see, within environments, those network diagrams sitting all over. Clear-text passwords. And who here thinks that "Summer2019" is probably a good password to spray on a network? Anyone here have "Summer2019" anywhere in their environment? Yeah, that's... We keep on seeing those passwords, and we'll talk

about it, because at the end of the day, credentials are the main issue when you go through all those attack vectors that we're going to talk about. So when it comes to reconnaissance, you want to make sure your environment monitors all activities against the domain controllers — there is a very specific footprint to most of the tools, BloodHound included — and you want to make sure you train your users around credentials. And the same way you keep your finance documentation and your HR documentation secure, you might want to keep all your network diagrams secure as well, because they can be used by the attackers. Okay, so we established the foothold, we started with some local reconnaissance, and

now we're ready to dump the hashes. So Microsoft is very handy, right? It gives us this process called LSASS. We'll talk about that, and we'll talk about NTDS. LSASS, as you've probably heard, is that process that, on Windows 7 and below, still today keeps credentials in memory that can be dumped. Now, LSASS keeps the credentials of not only the current user, but all the users that have logged in to the endpoint since the last reboot. And so, potentially, you can get some good hashes over there.
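What these "hashes" actually are is worth pinning down: the NT hash is just MD4 over the UTF-16LE-encoded password, with no salt, which is what makes pass-the-hash and offline cracking so effective. A minimal pure-Python MD4 for illustration (in practice you'd use hashlib or an existing library; this follows the RFC 1320 round structure):

```python
# The NT hash = MD4(password encoded as UTF-16LE). Unsalted, so identical
# passwords hash identically. Pure-Python MD4 per RFC 1320, for illustration.
import struct

def _rol(x, s):
    return ((x << s) | (x >> (32 - s))) & 0xFFFFFFFF

def md4(data: bytes) -> str:
    a, b, c, d = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    msg = data + b"\x80"
    msg += b"\x00" * ((56 - len(msg) % 64) % 64)
    msg += struct.pack("<Q", len(data) * 8)          # length in bits, LE
    for off in range(0, len(msg), 64):
        X = struct.unpack("<16I", msg[off:off + 64])
        aa, bb, cc, dd = a, b, c, d
        for i in range(16):                           # round 1: F
            f = (b & c) | (~b & d)
            a, b, c, d = d, _rol((a + f + X[i]) & 0xFFFFFFFF,
                                 (3, 7, 11, 19)[i % 4]), b, c
        for i in range(16):                           # round 2: majority G
            g = (b & c) | (b & d) | (c & d)
            k = (i % 4) * 4 + i // 4
            a, b, c, d = d, _rol((a + g + X[k] + 0x5A827999) & 0xFFFFFFFF,
                                 (3, 5, 9, 13)[i % 4]), b, c
        for i in range(16):                           # round 3: H = b^c^d
            k = (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)[i]
            a, b, c, d = d, _rol((a + (b ^ c ^ d) + X[k] + 0x6ED9EBA1)
                                 & 0xFFFFFFFF, (3, 9, 11, 15)[i % 4]), b, c
        a, b = (a + aa) & 0xFFFFFFFF, (b + bb) & 0xFFFFFFFF
        c, d = (c + cc) & 0xFFFFFFFF, (d + dd) & 0xFFFFFFFF
    return struct.pack("<4I", a, b, c, d).hex()

def nt_hash(password: str) -> str:
    return md4(password.encode("utf-16-le"))

# nt_hash("password") -> "8846f7eaee8fb117ad06bdd830b7586c", the well-known
# NT hash that shows up constantly in dumps and wordlist tables.
```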

We will talk about NTLM and Kerberos. By the way, NTLM is actually the NT hash — what Microsoft calls NTHash. LM, the LAN Manager hash, is the old hash that Microsoft doesn't support anymore, but for some reason people still call it NTLM. In Server 2016 and Windows 10 clients, Microsoft introduced Credential Guard. Credential Guard is a way to secure LSASS in an isolated area of memory that nobody can just access like that, even if you are running as SYSTEM. But you just need one machine in your network that is not 2016. Does anyone here have clients still on 2012 servers, or Windows 7? Anyone on 2008? Windows NT? Wow. Is it connected to the network, or is it

just sitting on the...? That's like history. You can donate it to a museum. So yeah, none of those operating systems can have this feature; it's not ported backwards. So Mr. Benjamin Delpy, the creator of Mimikatz, even created a workaround for Windows 10. So in the case of Mimikatz, usually on Windows 7 or 2012 servers, you would be able to use Mimikatz to dump the hashes. This is an example of the commands running on Windows 10, and what Mr. Delpy shows here is that you cannot really get the password now, when Credential Guard is enabled. Also, Credential Guard will keep the Kerberos keys. So those are the Kerberos keys — usually we would get them on Windows 2008

and 2012 servers, or Windows 7, as I mentioned. Now, on 10, there's still a way around it. The Security Support Provider is the component that communicates the hashes from the host out to the server. And so, if at some point someone is going to authenticate himself, the passwords are not secure anymore — they are in motion. They are secure only when they are at rest. And so, what Mr. Delpy is showing here is that he just injected a custom SSP — custom SSPs are supported by Microsoft — and at this point, everything is monitored and translated into clear text. So the bottom line here is: it's not a matter of whether you're going to get it. It's just whether you're going to dump it

from memory, or wait for those credentials to get out of memory and then catch them while they're on the way. Yeah. Thank you, Mr. Delpy.

All right, the next place hashes are located is NTDS.dit. This is basically the database of Active Directory, and it sits on every domain controller. This is a block diagram of the schema. The way it works is that it's basically a database, right, based on ESE, the JET Blue engine. And if you get hold of that, you get hold of the password hashes as well. Now, this is another demo where I can show you where the file is. It sits under the NTDS folder, and you cannot just copy the file as is, because it is being used by the Kerberos Key Distribution Center, the KDC. However, you can create a shadow copy of the file. And then, when you create a shadow copy of

the file, you can take it offline and start cracking it. So this is what I'm doing out here. Actually, I can move it a little bit forward. Now there's another component that needs to be used when you're looking for the hashes. It's called the boot key, and that is sitting in the registry. So what I do is I export the registry keys as well, which are under SYSTEM. And then, offline, I can start to crack it. Now, sometimes you will get a message that the file is not in a clean state, because we just created a shadow copy. So what you need to do is defrag it, make sure that you fix all the broken pieces.

And eventually you're getting this rain of hashes. That's gold. That's literally gold. Thank you so much. So in that specific case, I used Michael Grafnetter's tool called Get-ADDBAccount. There are other tools that can be used to get those hashes out of NTDS.dit.
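The offline cracking step on that "rain of hashes" can be sketched as a simple dictionary attack. The real NT hash is MD4 over the UTF-16LE password; MD4 is often unavailable in modern hashlib builds, so SHA-256 stands in here, and both the "dumped" hashes and the wordlist are made up.

```python
import hashlib

# Sketch of offline cracking: compare dumped hashes against a wordlist.
def nt_hash(password: str) -> str:
    # Stand-in for MD4(password.encode("utf-16-le")).hexdigest(),
    # which is what a real NT hash would be.
    return hashlib.sha256(password.encode("utf-16-le")).hexdigest()

dumped = {nt_hash("Summer2019!"), nt_hash("P@ssw0rd")}   # from NTDS.dit
wordlist = ["123456", "P@ssw0rd", "letmein", "Summer2019!"]

# Hash each candidate and check it against the dump.
cracked = {nt_hash(w): w for w in wordlist if nt_hash(w) in dumped}
print(sorted(cracked.values()))
```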

So, mitigation: make sure that hashes are protected on the host by not allowing high-privileged users to get there, because if they're getting there, the hashes potentially can be dumped. And from an NTDS perspective, make sure you protect not only your domain controllers but also the backups of your DCs. A lot of times those backups are not encrypted and are not really kept safely in a secure place, and most of the time when we find them, we can still get those hashes. So the other way of getting information is by stealing the hashes, which is tricking users into authenticating with you. There's a couple of ways to do that. A lot of them are based on the SMB authentication mechanism. So if you're familiar with

SMB authentication, the way it works is: if I want to access a shared drive, the server asks me to encrypt a challenge with my password hash. If it's encrypted and verified by the server, they grant me access; otherwise, access is denied. So what can I do to trick a user into connecting to my file server? Well, I can add an HTML image tag in either a page or an Outlook HTML form. And as you can see here, it's as simple as pointing to the file server where my Responder, or any other tool, is waiting to collect the hashes. So there's an already built-in module in Responder that will allow you to get those hashes.
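A toy version of that challenge-response exchange shows why a stolen hash is as good as the password. SHA-256 and HMAC stand in for the real NTLM math; the password is invented.

```python
import hashlib, hmac, os

# The server asks the client to prove knowledge of the password *hash*,
# not the password itself -- so stealing the hash is enough to
# authenticate (pass-the-hash).
def password_hash(password: str) -> bytes:
    return hashlib.sha256(password.encode()).digest()  # stand-in for NTLM

def respond(pw_hash: bytes, challenge: bytes) -> bytes:
    # "Encrypt the challenge with my password hash."
    return hmac.new(pw_hash, challenge, hashlib.sha256).digest()

server_stored_hash = password_hash("hunter2")
challenge = os.urandom(8)

# The legitimate client answers using its hash...
response = respond(password_hash("hunter2"), challenge)
# ...but an attacker who merely stole the hash answers identically.
attacker_response = respond(server_stored_hash, challenge)

print(hmac.compare_digest(response, attacker_response))
```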

Another way is to use custom forms in Outlook. Anyone familiar with that? So if I get a hold of a user's credentials, I can create a custom form. A custom form is when you get an invite from Outlook, or another email that looks different: it's because someone crafted a specific form, and that form is synced across clients. So it's going to sync to your Outlook client within your network even if it was created on Outlook Web Access. And that form can also carry a PowerShell payload, for example PowerShell Empire. So this is how it looks. You can create those PowerShell payloads, or you can just simply add an OLE object within an email, and then you

will see here the specific session key. This is the SMB authentication that includes the source identity's hash. And this could be cracked and eventually used to move laterally and escalate privileges. So all of those things eventually are going to help you get hashes, and by cracking those hashes, you'll be able to move to the next step in our kill chain. So how to prevent that? Mitigation for stealing hashes: this specific Microsoft custom-form issue has a patch. If you don't have the patch, the user doesn't even have to interact with it, and it will trigger the SMB authentication. If you have the patch, at least the users will get a notification about it. Block port 445, which is SMB,

outbound. There's no reason for you to have this port open. For God's sakes, my ISP is blocking my 445 outbound because they know how dangerous it is, and we still see it being opened. And block it on the endpoint as well, because clients can take their laptops home, and when they're looking at their email at home, the egress filtering rules are not in effect anymore; they're not going to block them. So make sure that the endpoint is blocking port 445 outbound as well. And make sure that the Windows login has a sufficiently complex password, because at the end of the day, that's gonna make the bad guys' lives harder when they're trying to crack your passwords. All right, next is a kind

of a legacy protocol that is still available in Windows 7. It's called LLMNR. That's a mouthful. And I put this here because it's just funny the way Microsoft sometimes does things. This is the way to disable LLMNR. You probably cannot read it out here, so I'll read it for you, but basically it says: when you enable the policy, LLMNR will be disabled; when you disable the policy, LLMNR will be enabled. I just thought it's funny that they're trying to make our life harder. But if LLMNR is enabled, which means the policy is disabled, then the attack vector works like this. You poison the subnet with a share name that

doesn't exist, so it's not going to be resolved in DNS. And then the victim is going to try to resolve it, and whenever the name is not resolved, it broadcasts a question: who is, in this case, snare01? And if I'm the attacker, I will say: I'm snare01, please send the hashes to me. And then I'm getting the hashes, and whenever the client, the victim, is trying to access the share, I tell him: thank you, bye bye, got your hash, I don't need you anymore. So that's how the attack works. There's definitely no need to allow this, but we still see this enabled on clients. Just disable LLMNR and NBT-NS, which is

basically the NetBIOS Name Service, on the network adapter. Now, if you have multiple network adapters, you need to do it on each and every one of them. Disabling LLMNR can be done from group policy, but disabling NBT-NS cannot, because you need to look for all your network adapters first; you can do that using a PowerShell script. SMB signing. What is SMB signing? SMB signing is a feature that allows the recipient to make sure the SMB session is coming from the claimed identity. So it's basically digitally signing SMB packets to make sure the source is who it says it is. A lot of people don't enable SMB signing because SMB signing has some performance

overhead. So there's a lot of performance testing that needs to be done before enabling it. But if it's not enabled, there's another cool attack called the SMB relay attack, which is basically a man in the middle, where I somehow get my victim to try to authenticate to me with admin rights. And then what I do is take the admin request and forward it to the target, and the target is like, "Okay, well, if you are who you claim you are, please encrypt my challenge." I take that request, forward it to the victim. The victim encrypts it. I forward it back to the target. The target is like, "Okay, you get access." And as always, what do I say

to the victim? "Bye." "Bye." So that's how it works.

So to mitigate that, just enable SMB signing. That can be enforced via group policy. And again, make sure that you're aware of the performance implications. Okay, now let's talk a little bit about Kerberos. It's kind of a headache, but if you talk about Active Directory, you have to mention Kerberos, and there are a lot of attacks based on Kerberos. So the way I kind of learned to speak about it is by analogy to an amusement park. In some amusement parks, when you go in, you have to buy a ticket for entering the park, but then you need to buy a ticket for each and every one of the rides. Not sure that's true anymore, but that's the way I'm going to compare it

to Kerberos. So there are two things that you need to know about Kerberos, two types of tickets: the TGT, ticket granting ticket, and the TGS, ticket granting service. The way authentication in Active Directory works is: I authenticate myself to Active Directory and then I get the TGT. This is the ticket that proves that I am who I claim I am. Now, using that ticket, I can ask to access services, for example a file server, a web server, any other service that supports Kerberos. I get a ticket for that service, and when I get it, I present it to the specific destination application. They verify that I have the permissions, and then they

create the session for me. Make sense? Now, what if I already had a high-privileged user's hash with me? Then I don't need to ask for the TGT. This is basically someone who just gave me their ticket to the amusement park. I don't need to pay for it; I can use it. Now, if this is a super high-privileged account, which is the krbtgt hash, this is the master key. This is the hash that has access to all the resources within Active Directory. Then I'm golden, okay? Then I can access anything within my environment. This is literally my golden ticket. So to create a golden ticket, which is basically crafting a ticket that Active Directory will believe is valid, I need

to have the Active Directory information, the SID, the security identifier, which I'll talk about later on, and the hash of the highest-privileged account, krbtgt. And when I've got all those ingredients, the way to prepare my golden ticket is very simple: just encrypt any user ID. It could even be a user ID that is not part of my domain, because Active Directory for some reason is not even verifying the user, at least for the first 20 minutes. Then use it with the krbtgt hash. This will prove to Active Directory that you have rights to access any service within the domain.
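That golden-ticket recipe can be sketched as a toy. In real Kerberos the TGT is encrypted with the krbtgt key; here an HMAC with an invented krbtgt "hash" stands in for that encryption, and the service-side check mirrors the point being made: anything that verifies under the krbtgt key is trusted, with no double-check against Active Directory.

```python
import hashlib, hmac, json, time

# Toy golden ticket. The krbtgt key below is invented.
krbtgt_key = hashlib.sha256(b"made-up-krbtgt-hash").digest()

def make_ticket(user: str, valid_hours: int, key: bytes):
    body = json.dumps({"user": user, "expires": time.time() + valid_hours * 3600})
    mac = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return body, mac

def service_accepts(ticket) -> bool:
    body, mac = ticket
    # The service only checks that the ticket verifies under the krbtgt
    # key -- it never asks Active Directory whether the user is real.
    expected = hmac.new(krbtgt_key, body.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, mac)

# An attacker holding the krbtgt key forges a ticket for a made-up user,
# with an attacker-chosen validity period -- and the service accepts it.
golden = make_ticket("not-even-a-real-user", 10 * 365 * 24, krbtgt_key)
print(service_accepts(golden))
```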

Now, another benefit is that when you craft the ticket, you can decide for how long it will be valid. So you can really change it from 10 hours to 10 years. Again, something that Active Directory is not enforcing. So this is the golden ticket, okay? Basically, think about it: this is the way for me to forge a ticket that will allow me to access all the rides in my amusement park. So obviously the key to a golden ticket is krbtgt, and that's why this is kind of a design flaw. This is something you need to make sure you keep safe, and you need to keep rotating its password, because otherwise, if someone got a hold of

it, basically they own you, own your domain. So Microsoft has another attribute, another way to give access to accounts, called an SPN, a service principal name. So what is an SPN? The SPN comes to solve the problem of accessing services that are not directly communicating with your clients. Let me give you an example. Let's say you're accessing your Outlook server. When you're accessing your Outlook server, the Outlook server on your behalf is trying to access other resources in order to let you work with your email, whether it's a database, or maybe the web access, or any other servers. And so if I get access to this client, I get access to other servers as well.

Now, Active Directory allows me to look for all those service accounts that have SPNs and request tickets encrypted with those accounts' password hashes. The fact that I get the password hashes will help me later on to generate those ride tickets, the TGSs, that will get me on the rides.
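The Kerberoasting idea being set up here, service tickets protected by a key derived from the service account's password, can be sketched as follows. The SPN, the account password, and the wordlist are invented, and SHA-256/HMAC stand in for the real Kerberos encryption types.

```python
import hashlib, hmac

# Kerberoasting sketch: a TGS for an SPN-bearing service account is
# protected with a key derived from that account's password, so any
# domain user who can request the ticket can guess the password offline.
def service_key(password: str) -> bytes:
    return hashlib.sha256(password.encode()).digest()

def issue_tgs(account_password: str) -> bytes:
    ticket_body = b"TGS for MSSQLSvc/db01.corp.local"   # made-up SPN
    return hmac.new(service_key(account_password), ticket_body,
                    hashlib.sha256).digest()

captured_tgs = issue_tgs("Winter2018")  # requested like any other TGS

# Offline guessing: no lockouts, no traffic to the domain controller.
recovered = None
for guess in ["123456", "Summer2019", "Winter2018", "qwerty"]:
    if hmac.compare_digest(issue_tgs(guess), captured_tgs):
        recovered = guess
        break
print(recovered)
```

This is why the strength of service-account passwords directly limits the attack: the only work factor is the offline guessing loop.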

And so in our example here, if I've got all the ingredients to create the TGS, the ticket granting service, for a specific application server, for example my file server or my web server, then I don't need to speak with the domain controller at all. I just created my own ride ticket. So the ingredients for the silver ticket are: query Active Directory for services, identify the service account that you're trying to access, identify the endpoint that you're trying to access, and then create this TGS, the ticket granting service, so that it looks exactly as if it came from Active Directory. So again, the idea here is to craft it, to communicate with the service

account without even communicating with the domain controller. So all of those things are eventually taking advantage of the way Kerberos works today. In the case of Kerberoasting, those passwords eventually need to be cracked, and so the harder you make the passwords to crack, the harder it is to generate a TGS. So, almost there, we're getting to the end of our ride: talking about a few more attack vectors that we see out there, and then we'll talk a little bit about what we can do about it in general, the whole idea of credential theft and where the future is, right? It's all the new NIST suggestions on passwords, and biometrics for authentication.

All of this is trying to address the attacks that I'm talking about here. So, the SID, the security identifier: that's basically the number that identifies either a host or a user within your environment. That's how it works. Microsoft had to add an attribute to users called SID history. The idea behind it is that when you migrate from one domain to another, you need to keep the old SID, the SID that came from the old domain, alongside the SID from the new one. And of course, people started thinking about how to exploit that feature. So the way to exploit it is: well, we know a list of well-known SIDs. Those are the strongest SIDs; they have a very well-known ID.
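The exploitation idea being set up here, injecting a well-known SID into a user's SID history, can be sketched as a toy access check. The SID strings are placeholders, but the shape mirrors how a token's own SID and its SID history are unioned when access is evaluated.

```python
# Toy access check for the SID-history idea. An injected well-known SID
# grants whatever rights go with it, because history SIDs are trusted
# just like the account's own SID.
DOMAIN_ADMINS_SID = "S-1-5-21-<domain>-512"  # well-known RID 512

class User:
    def __init__(self, sid, sid_history=()):
        self.sid = sid
        self.sid_history = list(sid_history)

    def effective_sids(self):
        # Access checks see the union of the SID and its history.
        return {self.sid, *self.sid_history}

def can_admin(user):
    return DOMAIN_ADMINS_SID in user.effective_sids()

lowpriv = User("S-1-5-21-<domain>-1104")
assert not can_admin(lowpriv)   # no rights to begin with

# Injection: without SID filtering, history from the "old domain" is
# trusted as-is.
lowpriv.sid_history.append(DOMAIN_ADMINS_SID)
print(can_admin(lowpriv))
```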

Well, what if we take one of those SIDs and inject it into an existing user... Sorry. Yep. I know I have a high voice. Sure, no problem. Anyone that put their earplugs in can take them off. So, yeah. Anyways, you inject those SIDs. If you can inject them, then you can effectively impersonate: a user with no rights gets a lot of rights. And so it is very important when you migrate users from one domain to another to make sure you clean the SID history. And if you're looking at trust between domains, make sure there's SID filtering and those SIDs are being verified, so you're not falling into this SID history injection attack. And finally, there are two attacks that are kind of opposites of each other, but

both are built on the fact that domain controllers, for high availability, need to sync with each other. So domain controllers, this brain that we spoke about: we have a lot of them, and they keep communicating with each other. If I have a rogue domain controller, I can either push some bad object into the network, or I can pull some cool information from it and use it. So DCSync is me replicating the data to my rogue domain controller, and DCShadow is me injecting backdoors that later on can be used. So, a very good example of a shadow. Remember the SID history, right? How can we leverage both DCShadow and SID

history? We can register a rogue domain controller real quick, push one of the objects to include the SID history of a high-privileged user, and then disconnect ourselves before being detected. So now we've just inserted a backdoor, and that user can have high privileges and be used for escalation. As for mitigation, there's a list of events within Active Directory that will give you a hint that someone's trying to do DCShadow. Be aware of those; they're basically a very high-fidelity level of detection. Alternatively, if you monitor all the traffic, you will be able to see the calls that push those requests. The DCSync part is replicating the directory: that's you pulling information from valid domain controllers to your mimic domain

controller. You add yourself into the domain controller... sorry, you add yourself into Active Directory, and then you say: hey, I'm a new domain controller sitting in a remote site for disaster recovery purposes, please send me all the hashes. And you get all the hashes. So the mitigation there is: make sure you have control over the specific users that have replication rights. By default, not every user is allowed to replicate the data back. So I guess the bottom line of this talk is, and I'm sorry if it was a little bit overloaded with attack vectors and details, but I think the common denominator of all those attacks is that they cannot really be fixed,

because they are based on design flaws. Microsoft Active Directory, after 15 years, still relies on NTLM hashes when it comes to authentication. It doesn't verify the source, so you can relay NTLM hashes, you can use them from different machines, and those design flaws cannot be fixed retroactively; they cannot even be changed, because they are part of the design. Kerberos is a little bit more secure, but it's the same thing, as I showed you: you can forge those tickets, you can fake them pretty well, and the services that accept them don't double-check with Active Directory. There's too much trust, I would say, on the service side. And it all starts with stolen credentials. I mean, statistics tell

us that almost 50% of breaches are the result of some kind of stolen credentials, whether it's phishing, whether it's social engineering, or just buying credentials on the dark web, what have you. Those things eventually will end up as a breach. Now, I'm not sure, has anyone heard of the new NIST password guidelines? Yeah. So what NIST is basically saying is: listen, it doesn't work, right? All those special characters, all this password rotation every 90 days, it doesn't work. Maybe we should try something else. Maybe we should look for easy-to-remember but hard-to-guess passwords and just keep them, so you don't need to worry about the whole rotation thing. Now, I don't know if it's going to work or not. There are too many other

standards out there that say exactly the opposite. But obviously we all acknowledge that the way passwords are handled today is a problem. And in order to improve it, we need to make sure we implement the principle of least privilege. At least make sure that the damage is not that bad, by making sure everyone gets the permissions that they need for their work. No more. Not less, but no more. And only at the time when they need it. And make sure you do the separation of tiers, so if someone is compromised, it doesn't bleed over to the servers and domain controllers. You want to make sure that all of those zones are buffered. And use multi-factor authentication, for God's sakes.

I still don't get it. Obviously for external access, but also for internal. I mean, nothing bad will happen if users have to verify their identity, and you can potentially prevent a breach. And you can look at those risks that have to do with authentication as a spectrum, right? Low risk, medium risk, high risk. So use adaptive enforcement: depending on the risk, you can react differently. You can extend an MFA challenge to verify an identity, or you can just send an email to the user and ask them: hey, did you really access this finance server, or is it someone else? And start using some controls. A lot of the defenses today are based on the aggregate attack surface, which means taking all the logs over

all the attack surfaces and, kind of after the fact, trying to figure out what was the reason for the specific breach you just got. Well, I want to have some controls in the way to potentially stop it. Yeah, even deny access, and maybe get a phone call asking you to enable that access again. That's it, you know? It's not the end of the world. So to summarize: 15 years later, we still have problems with Active Directory that are probably going to stay for a while. So make sure you are aware of them to begin with. And on the defense side, make sure that you monitor your domain controllers, because they are the keys

to your crown jewels. Thank you. Appreciate your time. Thank you. Unfortunately, we don't have time for questions for this talk, so we're going to get ready for our next talk now. Is that ever normal, or is that almost always an attack? Like if you see that in your logs. Which one of the events? The rogue domain controller one, where it said there was an event. Yeah, yeah. Is that ever like a normal operational thing? It's not a normal operational thing, because you're basically adding a domain controller. When you add it, this event is basically telling you that someone that is not registered is just

trying to add itself into the domain. I don't know when but I'll try to put it as specific before-- Can I give you my email so you can send it? Sure, sure. The Mac of 1,000 adapters here. OK, cool. There we go. Yep that's fine. Oh, yeah.

- Yep, ready to go. - All right, thank you for joining us at BSides. We are in the Ground1234! talk room, and we are here with Robert Paul doing Enterprise Overflow: How Breached Credentials Impact Us All. BSides would like to thank our inner circle sponsors, Critical Stack and ValorMail, as well as our stellar sponsors, which include Secure Code Warrior, Paranoids, and Amazon. These talks are being recorded and streamed to YouTube, so if you can silence your cell phones, we would appreciate it. And if we do have time for questions at the end, if you could raise your hand, I'll run the mic to you so that the people listening online will be able to hear your question as

well. And with that, let's get started with Robert Paul. Thank you. - Thanks, everybody. So just to start off here, there's the GitHub link for the tool that's going to be released in this talk. But don't worry, as I go through the slides, it'll be there at the end as well, in case you missed it. So let's just get started. As you heard, I'm Robert Paul from NuID. I'm also a USAF Reservist. What we do at NuID is specialize in authentication solutions, really around zero-knowledge proofs and all of that great stuff. So really my job as director is heading a project that we internally call Project Nebulas, and this is to research the weaknesses in

credentials and authentication, for our solution but also just general weaknesses in the industry. So I want to start off today talking about the Canonical breach. If you haven't heard about it, it happened just about a month ago. And this is a pretty significant breach, or it could have been, if the hacker hadn't just defaced packages. He had gained access to about 39,000 packages inside of Canonical, and for those of you that don't know, Canonical runs the Ubuntu operating system; they maintain and develop it. So this could have been a very significant breach, and it could have impacted at least 53% of Amazon Web Services and probably a lot more than that out there on the internet. But I'd like to point

out here that right there, a Canonical-owned account on GitHub was compromised. So they had a credential-based compromise on their account: someone logged in with a reused password and was able to access those packages and repositories. So Project Nebulas is really our internal research project where we have accumulated leaked data since about 2009. So far we have about two and a half billion unique passwords, and in all it now totals more than eight billion unique records. This includes social security numbers, addresses, hashes, all of that. And when we look at the list of data breaches reported on Wikipedia, which contains all of the breaches that basically have a news article associated, that only totals to about 9.7 billion records. And

the majority of the data that we've collected isn't actually on the Wikipedia page. So there's a massive underreporting of leaked data out there. In 2017, Verizon's data breach investigations report revealed that 81% of hacking-related breaches were caused by stolen or weak passwords. And Troy Hunt, of course, everyone here may be familiar with him, found that 86% of passwords coming out of recent breaches had all been previously seen in data. So what that means is there's a lot of password reuse. And I want to point out that in Verizon's 2019 breach investigations report, we have seen that the amount of stolen credentials being used in attacks appears to have gone down. But that isn't

really the case; it depends on the industry you're in. For example, in manufacturing, it's still as high as 80%. That's because the scope of their report changed quite a bit, and there have been a lot more breaches, and different types of breaches, as more and more techniques and vulnerabilities come about. So I'm going to talk a bit about the life cycle of a credential breach, kind of what happens from basically ground zero, which is going to be hackers, of course, compromising an organization through any variety of methods. And then of course they're going to gain access to additional credentials. Usually after they compromise an enterprise or maybe a large network, they're going to gain access to some repository of user credentials or

perhaps Active Directory hashes and so forth. And what they do next with this: people assume that they immediately throw it on the dark web and start selling it, but they actually don't. What they do is cherry-pick each little piece of data out of that leak and use it for themselves to generate further compromise in networks, engage in fraud or cryptocurrency theft, etc. They actually share this data amongst themselves, and this is where you can intercept it, in something I like to call the cybercrime holy trinity, which is Tor, Discord, and Telegram. If you're able to embed yourself in these organizations and these groups, that's where they're at. They're actually mostly on Telegram and Discord. They're not even hiding or anything.

Occasionally they'll be on Tor, or at least some of them, the more paranoid bunch, will be. And they share it amongst themselves to help them conduct SIM swapping attacks and so on. And only after it's outlived its usefulness to them, which could be months or years later, sometimes weeks later if it's moved pretty quickly, or sometimes someone just gets a hold of it and decides to sell it, that's when it will end up on Tor or some kind of dark net forum, being sold for anywhere from $2,000 all the way up to $20,000. And that's when it gets distributed widely. That's when it gets picked up in the news

and that's when we start seeing big ripple effects of people starting to do anything from credential stuffing attacks to just trying to access users' accounts. So there are different parts in that very first stage of the compromise in our life cycle, different ways that hackers can compromise organizations. The first one would be trust relationships with B2B. For example, Target, everyone here probably knows about that, was actually hacked through their HVAC vendor. As Brian Krebs points out here, its systems trace back to network credentials that were stolen from a third-party vendor. What had actually happened with the Target hack is an HVAC vendor was compromised through stolen and leaked credentials,

and then they were able to just pivot through their network with more stolen credentials until they eventually hit the systems that stored credit card information within Target. The second one, of course, is going to be the shared user base. If someone here has a Twitter account, they probably have a Facebook account. Everyone here has a bank account, probably with one of five banks. So if there's a major breach of user credentials, say at Yahoo, where one billion user accounts alone were breached, people are going to take that and start stuffing it at other websites like Facebook and Twitter. That's a pretty low-tech solution, but they do conduct these attacks, especially on cryptocurrency

exchanges. They've seen huge spikes in fraud after there's been a leak such as Collection 1. And then the third one, again, is what I just talked about: credential stuffing. But it's actually kind of different from the shared user base, because this is just mashing together all the data that they can get, in username and password combinations, also called combo listing. You may have heard it called that. And that's when they just try it across your entire network. This is everything from FTP servers to SMTP servers, anywhere you can take a username and password on your network: they're going to attempt to use combo lists on it. And they've been doing it for quite some

time. There's active development of this tooling. This one is called Email Combo Lister Slayer; it's a well-known one on carder forums. They tend to usually use this for fraud, but you can actually repurpose it for kind of arbitrarily blasting at a whole network. And they always have great 90s-esque graphics or something. And I want to point out the words that you always see up here: they're always pointing out that they have fresh lists, or the freshest data. That's because organizations often reset their passwords at certain intervals, so the more recent your data is, the more relevant it is to them, so they're always looking to sell the freshest combo lists out there. I'm actually going to demonstrate, using our internal

data set, how easy it is to generate one of these combo lists, for two types of organizations: one is going to be a smaller organization and the other is going to be a large enterprise. I'm going to go ahead and get that ready for you. That is the large organization here; I'm going to get a small one. Just full-screen that. So I'm going to go ahead and paste here, and I'm going to pause that. The domain there that you see is squareup.com. They did the keynote for Black Hat this year. So I'm going to generate a combo list for their stuff, just for the heck of it, and output it. There's a variety of command-line options

for our internal red teaming tool; mostly I'm going to output the results, and I'm doing a wildcard query. I'm just saying: give me everything in my data related to squareup.com. So I'm gonna let that run. It takes a second for it to spin up 'cause it's gotta go and query Elasticsearch, and there it goes. So within like half a second, I got 394 results containing everything from hashes to PII and all kinds of stuff. And right over here you can see the databases this came from. So we got everything from Outro to Dropbox, which is like ancient in terms of leaks now. And we

ended up with a total of 225 hashes. So what we're going to do now is attempt to crack them. I'm just going to let that run, so it's going to go and do its thing. And real quick, I'm going to pause that. So every red thing there is a plain-text credential. I have it nice and color-coded so it's easy to see. And those are going to come from, obviously, these ones are from exploit.in and Collection 1, and Collection 1 is basically just the exploit.in, LinkedIn, and Dropbox leaks that have been cracked. So as it's running and cracking hashes, I'm just scrolling up here in the tool to kind of demonstrate that it's pretty

easy to get plain-text passwords out of there. And so we already cracked one hash. And I really should say decrypted, because we're not actually cracking the hashes here. What I'm doing is going back over the Nebulas data and querying it: have I ever seen this hash before? We basically take all the plain-text passwords that are in our data set, and hashes that have cracked passwords with them, and then we generate a big rainbow table for a variety of different hashes, everything from SHA-3 to SHA-256 to NTLM, as you'll see in our later tool. So I'm just gonna go ahead

and skip way ahead here, 'cause this is gonna keep going; there are 200-some hashes and it takes about three or four minutes to run. Eventually, though, there we go. So we can see that on a small organization we cracked three of the 225 hashes, which isn't very much, but even on an organization that's mostly cloud-hosted and doesn't have that many people, it yielded 30 unique username and password combinations. On a small target like this, it's not really feasible to just attack them on their network with it; the chances that a combo list with like 30 combos in it would actually work on their internal organization are

small. But when I look at all their GitHub accounts, if I stuff these into GitHub and Twitter and anywhere else I can think of that they may use, or any of their vendors, then it becomes really apparent that it could get pretty dangerous. You could eventually compromise something related to them, like what happened with Canonical. And then the second scenario here that I'm gonna start up is a large enterprise, and this is actually going to be Lockheed Martin Corporation. So every time that scrolls, that's 10,000 results, and it's just gonna keep going for a little bit there. So we queried lmco.com and said, hey, give me all the data on Lockheed Martin. Most of this is old. This is all stuff

from like LinkedIn, Collection 1 and so forth, but you'll be able to see the scale. Actually, let me full screen that for you there. I'm gonna let that run for just a couple seconds, and I'm gonna pause that. All right, so right up here we can quickly see that we got 139,494 results in our initial query. That's everything from dates of birth to IP addresses to hashes and plain text passwords, as you saw the red go by. And that yielded 22,000 hashes total, and of the 22,000, only about 3,000 are actually salted. So we're seeing that, basically, once we start combo

listing a target like a bigger organization, we find that the password security of other people is actually way worse. So Lockheed is probably doing everything they can; I know from experience in the Air Force that they use two-factor on everything, sometimes three-factor, and you have to use the smart card and all that to access a lot of their systems. So it's a pretty significant number of hashes, and as we'll see here once the combo list completes, it's just going to keep running. I'm actually going to have to skip forward; it's just going to keep going and going and going. It only takes about four minutes

for it to go and get through all these. At the end, what we see is we actually cracked a significantly higher ratio: instead of three out of 200, now we cracked 7,000 out of 22,000. This is mostly because the data these breaches came from is MD5 and SHA-1, so they're relatively easy to crack, and also easy to rainbow table, 'cause most of those hashes were unsalted. And actually, when I looked at this, most of the hashes we did compromise were just rainbow-tabled, because they were the unsalted ones. We barely got any of the salted ones. So definitely salt your

hashes. It makes a huge difference, and it's a shame that Active Directory doesn't do that. So in all, we ended up with 6,349 unique username and password combinations. Now, when you look at the scope of a large enterprise, that's 6,000-something accounts that I can try on every single authentication point inside their entire organization that faces the internet. That's every FTP server I can find, basic auth, routers, big F5s, you know, load balancers, anything. And so it becomes a really big issue, because not only can we do that on their network, we can do the same thing we did with the small organization. We could target GitHub accounts, vendor accounts; if they use a specific fuel supplier, might as well try this list on

that. And so the problem can start spiraling out of control really quickly. All right, let's get back to our slide deck here. So this really highlights the systemic problem, and that is: are you triggering your incident response when someone else gets breached? And we actually saw the industry do this in the LinkedIn breach. There were six and a half million plain text credentials leaked by some random hacker who compromised LinkedIn and then dumped it out there, and they were storing a combination of hashes and plain text passwords. Then hackers out there started using tools to stuff these credentials into different enterprise networks, and it created this huge ripple effect of hacks. And there were

news articles at the time, way back then, going, oh my god, reset your passwords now, because there were people actively exploiting this. And the entire industry responded by triggering their incident response and resetting their passwords internally inside their networks. And that's what's required. The problem is that we haven't really, as an industry, kept doing that. What ends up happening is, other than maybe the big ones, like Collection 1, or when Dropbox happened, we'll see it happen then, but not the little tiny breaches, like little phpBB forums that end up on Tor somewhere with 3.4 million users, and sometimes it might contain some enterprise accounts

in there and passwords. And that's really because of the challenge of having to trigger your incident response every time there's a big compromise of a company; it happens all the time. So incident response teams also must be pretty proactive in resetting breached credentials before they're abused, and that's not easy to do. As we saw with the Wikipedia page I mentioned earlier, the amount of actual reporting on all this is relatively low, so we're not really able to understand when things are being breached, and it's also very difficult to audit your passwords faster than they can spray them at your network. And also, incident response teams must be able to

respond anytime someone else gets hacked anywhere. Obviously, that's not very realistic. And we have a phrase in our red team: there's only two types of networks, those who have been hacked and those who know they've been hacked. And oftentimes, as we saw, by the time it ends up on Tor or any of the darknet forums, the breaches are sometimes years old, and that's only when they're discovered. So it's not really realistic. We need a solution that can basically intercept these dumps as they happen and get these credentials made available to the industry. So there are some solutions, thankfully. The industry does try to collect these dumps. They often pop up for only sometimes hours, and then

they're gone and no one ever has them again. So you have to ask someone, hey, did you get a copy of this breach? Collection 1 had that problem: it popped up for a little while, then it disappeared, and then you had to go find someone who had a torrent of it or a backup of it. And then it reappeared once someone else started reselling it for cheaper. So there's basically a growing community around collecting breach data and making it available for enterprises to use to audit their credentials, either their user databases or something like Active Directory. Databases.today is a great one. They host just piles of leaks; I think they have about 2,300-ish different individual breaches out there that you

can just go and download. I think it totals about 3.8 billion records or so. Also, there's Troy Hunt. He has released millions of compromised hashes that you can use for auditing your network, and the links to both of those are up there. One thing to note about Troy Hunt's data, though: he has like eight billion records or so, same as us, but only about 555 million of those are actually available in his API. I'm sure he's probably working on making the rest of his data set available, but for now, we're kings. We've got 8 billion documents ready to go, and we also have 2.5 billion

unique passwords that we've added to our rainbow table, so they're already available on our API. And so, really, this highlights the need for industry tools that can adapt very quickly and make this leaked data available to people in a very quick pipeline. And that's what we've built out of our research project. So NebulousAD is our free, open-source tool that anyone can go use, and its purpose is to automatically audit your Active Directory for any leaked credentials. That's exactly what it does. So now it's time to demo that. That's gonna be the help output up there while I go ahead and get the video ready for you guys.
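As an aside, the rainbow-table idea described earlier (precompute several hash digests for every known plaintext, then "crack" by lookup instead of brute force) can be sketched in a few lines of Python. The hash types and table layout here are illustrative only, not NebulousAD's actual schema:

```python
import hashlib

def digests(password: str) -> dict:
    """Precompute several common digests for one plaintext."""
    data = password.encode("utf-8")
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha3_256": hashlib.sha3_256(data).hexdigest(),
    }

def build_table(plaintexts):
    """Map every digest (any algorithm) back to its plaintext."""
    table = {}
    for pw in plaintexts:
        for digest in digests(pw).values():
            table[digest] = pw
    return table

def lookup(table, unknown_hash: str):
    """'Crack' a hash by lookup; None means it's not in the table."""
    return table.get(unknown_hash.lower())

# Tiny demo set; the real table is built from billions of leaked plaintexts.
table = build_table(["password123", "hunter2"])
```

Because the table keys on the digest itself, an unsalted hash of any supported type resolves instantly, which is why the talk's salted hashes mostly survived.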

So there I'm just typing some options into the command line; snap would be snapshot. That is the easiest way to use this. It uses ntdsutil to grab a snapshot of your DIT file and your SYSTEM registry hive. And I'm gonna actually pause that. So what I'm actually doing is telling the tool to grab a snapshot and shred the files with a seven-pass overwrite. It uses SDelete to do that, which is a Sysinternals tool, so it's a more trusted method of doing that. And then I'm also instructing it to grab user status for the output, and also the time their password was last set, which could be

useful to you. It's all optional. You can literally just do snap and check, and it will go ahead and snapshot and audit your network. It's pretty much two commands and it's ready to go. So I'm just gonna let that run. And what it's doing now is going into the SYSTEM registry hive and grabbing the AES decryption key for your Active Directory database. Yes, it is encrypted, but any hacker can do this just as easily as someone trying to audit your stuff, so that's one thing to note. And now it's actually going in and decrypting your, or at least our domain's, DIT file. It'll take a second for that to run.
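For reference, the same extract-and-decrypt step can be done by hand with Impacket's secretsdump against an offline copy of the database; this is the standard Impacket invocation, not part of NebulousAD, and the file paths are placeholders for your own snapshot output:

```shell
# Dump NTLM hashes from an offline NTDS.dit. The boot key stored in the
# exported SYSTEM registry hive is used to decrypt the database, exactly
# as described in the talk. LOCAL tells secretsdump to parse local files
# rather than connect to a remote host.
secretsdump.py -ntds ntds.dit -system SYSTEM LOCAL
```

This is essentially what the tool automates, minus the snapshotting, shredding, and API check.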

It's almost done. Once it decrypts the DIT file, it has to defragment it, and there are lots of techniques for that. Actually, pause that; I'm going to rewind a little. So we're using Impacket under the hood there. And I'm just going to explain a little about the color options. The gray users here are the disabled users in the domain, and the blue ones are just the test accounts I loaded into our test domain that are active. And if I re... wait, I don't want to rewind, I want to let it play, sorry about that, guys. So as it goes, again, like the previous tool, red is dead: those are compromised user accounts in the domain. It's actually

checking it against NuID's API right now, and it's identifying those users whose actively set password is appearing in leaked data. Oh, sorry, I'm going to get that last bit there so you can see it. And so at the very end we can see that six out of the 206 accounts have been identified as having leaked credentials, and then we output the result to a CSV file. All right. So there are some great benefits with this tool. We can integrate very easily into existing tools like a SIEM, syslog, etc. to alert you in real time about your compromised credentials. The way we can do that is that every time it audited one of those accounts, it created an audit success or audit failure event inside the

Windows event log. Meaning, if you're pushing logs from your domain controller to some kind of logging solution, which I don't know anyone here who wouldn't do, unless you don't want to know who's logging in and out of your network, you're able to trigger an alert based on that event. We can also set this up with the Task Scheduler. This was actually designed to run in the Task Scheduler, to be completely automated. So it's set it and forget it: you can just audit your network at intervals of one day, one week, or whatever your security policy decides. You can combine it with your own scripts to take automated

action. So you can actually lock out users' accounts or force a reset on next login, et cetera, based on the event. You can tell the Task Scheduler: when you see this, disable this account. We didn't build that into the tool, because every enterprise is different and they may not want to impact their users. If they automatically lock the account, that user will get locked out every time the tool runs and won't be able to access resources. But you could do things like forcing the password to reset on their next login and so forth. We built this in Python, and it is open source, so

you can just go in and inspect it, do what you need to do, make any changes that you want to. And if you do make changes, be sure to push a branch up; it would be pretty cool to add features in, I don't mind. But you can run it without the Python interpreter as well; I actually have a compiled release for this, so you can just download it, set it up in Task Scheduler, and get it running. It's pretty sysadmin friendly, so you don't need any real coding skill to set this up, like you would with auditing, say, the torrent file from Troy Hunt, which contained like 555 million NTLM hashes.
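To make the contrast concrete, here is roughly what that manual audit looks like: load a downloaded list of leaked NTLM hashes into a set and check each extracted account hash against it. The one-hash-per-line format mirrors releases like Troy Hunt's NTLM list, but the filename and account data below are made up for illustration:

```python
def load_leaked_hashes(path):
    """Read one NTLM hash per line (ignoring any ':count' suffix)
    into a set for O(1) membership checks."""
    with open(path) as f:
        return {line.split(":")[0].strip().upper() for line in f if line.strip()}

def audit(accounts, leaked):
    """Return (username, hash) pairs whose hash appears in the leak set."""
    return [(user, h) for user, h in accounts if h.upper() in leaked]

# Tiny stand-in for the real multi-gigabyte download.
with open("pwned-ntlm.txt", "w") as f:
    f.write("32ED87BDB5FDC5E9CBA88547376818D4\n")  # well-known NTLM of "123456"

# Hashes as they would come out of the tool's CSV/JSON output.
accounts = [("alice", "32ed87bdb5fdc5e9cba88547376818d4"),
            ("bob",   "ffffffffffffffffffffffffffffffff")]
hits = audit(accounts, load_leaked_hashes("pwned-ntlm.txt"))
```

The logic is trivial; the pain the talk describes is operational: downloading, normalizing, and storing hundreds of gigabytes of leak data before this loop can run.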

And there's also no need to store terabytes of hashes to audit. Our data set is like 13 terabytes or something of leaked data, so it's pretty unwieldy; it takes us almost two weeks to re-index this thing in Elasticsearch. So it takes that overhead of having to manage it off you guys. Also, we output to a universal format, either CSV or JSON. So if you can develop and you do want to do more with the output of this, once the output is generated, you can just grab the CSV or JSON file and feed it into something like Logstash. And there's lots

you can do with that there. We're also using a familiar library: we built this on top of Impacket, so if you've used Impacket before, you can use our tool. So I'm gonna talk now a bit about what the tool actually did when it ran. First, it clones the NTDS.dit file and your SYSTEM registry hive using ntdsutil.exe, which is built in, and it creates an exact copy of your database. You can't actually go in and decrypt the live database; it's locked. Even if you're NT AUTHORITY\SYSTEM, you can't gain access to it. You have to use a volume shadow copy to gain access to that. Inside the SYSTEM registry hive is your decryption key. We're actually going

to take that and then decrypt the database, as I mentioned earlier, and iterate through it to gain access to the NTLM hashes that are in there. Once we have access to those hashes, we can do whatever we need to do to process the user's authentication data. And at this point, you don't have to use our API. You can just output the results to CSV or JSON and then audit them yourself with your own data. You can go on databases.today, download all of the different dumps, and create a rainbow table of your own. I do want to note, though, that that is a major pain

to do, because of the way the data is stored. It's in a bunch of weird, different formats; every leak is different. Sometimes you get straight sqlmap output, sometimes it's a CSV with a weird delimiter. So you have to parse each individual data breach, and there are thousands of them, so it takes a long time to do. And if you specify the check option, we will go ahead and check these against NuID's API by wrapping them in SHA-256. Obviously we're not going to have the NTLM hash sent over the network; that would be crazy. So we created a special rainbow

table that has all of our NTLM hashes wrapped in SHA-2. That enables us to check hashes without needing to send the actual NTLM hash over the network. So once we get that, we wrap it in SHA-2 and then do an HTTPS GET over to the API. And the API will respond with 200 OK, the hash is there, or 404, hash not found, as in it's not found in our dataset. So if you get a 404, your hash is safe; if you get a 200, yeah, it's compromised. It actually has a JSON output saying whether or not it's in there, but you can just go purely off of the status code. That

way it's a lot quicker; you don't have to parse JSON and deal with the serialization and deserialization overhead. And then, of course, once we have the result, we log it to the Windows event log and dump all of the results to the specified output, if you want an output of that. And this is what it looks like in the Windows event log. So you'll see there that the source is nebulousad.exe; that's just the name of the exe file. We have our event ID, which is just 4141, and here I'll go ahead and highlight those for you. So I want to point out that if you do make a SIEM rule, what you want to

do is grab that extended information, because that's where we store which account is compromised. So if you make a rule, you want to grab that additional information so it's available in your dashboard when you get an alert. And of course, we use audit failures or audit successes to do that; that's the bottom right arrow, and then our event ID there. Now, the user that you see there is just whichever user in my test domain ran this, in this case the local administrator, because, you know, it's just a demo. And then the computer, in this case the domain controller it ran on; this is useful so that

way you can audit what system is actually auditing all of your passwords; it'll tell you the user and the machine that did it. So, I want to talk a bit about what we're doing in the future. For now, we're stuck on Python 2.7, unfortunately, and that's due to limitations within Impacket. We're actually looking at porting to 3.7 soon for that very sweet async functionality, and hopefully we can get that done before the end-of-life date, but Impacket is a massive project and they still haven't ported it, so we'll see, no promises there. But I definitely want to get to 3.7, 'cause if we get async, we'd be able to audit like 50,000 users in a second, and that'd be really cool. Also, more data: we're always adding data to this.
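Going back to the API check described a few minutes ago: the client never sends the raw NTLM hash; it sends the SHA-256 of the hash and keys off the HTTP status code. A minimal sketch of that client-side logic follows; the status-code convention is as described in the talk, while the endpoint URL in the comment is purely hypothetical:

```python
import hashlib

def wrap_ntlm(ntlm_hex: str) -> str:
    """SHA-256 over the lowercase hex NTLM string, so the raw NTLM
    hash itself never leaves the machine."""
    return hashlib.sha256(ntlm_hex.lower().encode("ascii")).hexdigest()

def is_compromised(status_code: int) -> bool:
    """API convention from the talk: 200 = hash found in the
    leaked-credential set, 404 = not found."""
    if status_code == 200:
        return True
    if status_code == 404:
        return False
    raise RuntimeError("unexpected API status %d" % status_code)

wrapped = wrap_ntlm("32ED87BDB5FDC5E9CBA88547376818D4")
# An HTTPS GET would then be issued against the audit API, e.g.
#   GET https://api.example.com/v1/hash/<wrapped>   (hypothetical URL)
```

Keying off the status code alone is the speed trick mentioned in the talk: no JSON parsing is needed on the hot path.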

We started out with like 5.4 billion, and now we're creeping closer and closer to 10 billion. We had the 8.3 billion that I mentioned, which was the document count in Elasticsearch, but there's a lot of data I haven't indexed yet. So we're always adding more data, and we're always just gonna keep throwing more stuff into our rainbow table, and that's gonna be available to you guys for free. Next, what we want to do is actually permutate all the passwords. And the reason you want to do this is, first off, because hackers are doing it, but also it'll catch policy violations from people who just do really basic stuff, like if

they have MyDogSpot2018 and then set their password next time to MyDogSpot2019, you're going to catch those policy violations. You'll catch those users. And so we want to permutate the entire data set looking for patterns like that, and yeah, it's a lot of work. When we tested doing this before, it was estimated it was gonna end up being like 50 billion additional hashes added to it, so it'll be quite a lot. But it'll catch any of the permutated passwords from tools like Crunch and so forth. Also, easy-mode SIEM rules. I didn't get any built in time for the talk, but I definitely want to release some. We use Security Onion internally, so I can release

the dashboard and the rule set for that, so you can just import it and it's ready to go. And you can do the same thing in McAfee SIEM and ArcSight; those are the two that I'm aiming to create a rule set for. And if anyone wants a rule set for any other one, just message us, either email us or raise an issue on GitHub, and I'll totally try to make an easy-mode rule set for you guys, so you can just import it and it just works. And then for the very thick tinfoil

hats in the room, I also want to mention that some of you guys don't like the idea of SHA-256-wrapping the NTLM hash and sending it over. Troy Hunt's API has something called k-anon, or k-anonymity. That's where he actually uses the NTLM hash, but just takes the first five characters of it and sends those over. What you get back as a result is a list of 400 to 500 hashes, and then you have to iterate through all of those to figure out if there's a match. And that's kind of an inefficient way of doing it. And we at NuID actually kind of specialize in zero-knowledge cryptography, and hashes are actually really, really great for using zero-knowledge proofs

on. So we may go beyond k-anon and, instead of just wrapping the hash in SHA-2, just use zero-knowledge proofs to do the check. That way you're not transmitting the hash or any part of the hash; you're just transmitting something that can be used to verify whether or not we have it in our data set. And then the last one is, hopefully we can get an API to check user credentials during registration. This tool is really just for Active Directory, and that's a very static data set. You know, you just have your users in your environment, and there aren't really too many added or removed unless

you're a very big organization. So it also doesn't really cover someone out there running, like, WordPress with 3,000 users. So there's a way we could expose a check during registration, so we can check the user's credentials as they're being registered on the site and say, "No, password123 is terrible, please don't use that." So... there are some known issues that I would like to point out, because Impacket has a bug when dumping NTLM history on some builds, not all. I have a GitHub issue raised for it; the link's there, so you can track it if you want. And what that is, is it's only for the history, so you can still check the actively set password. That works

just fine, but you won't be able to check the history of the user. I only really encountered this on Server 2016 and 2019 in AWS, not when I installed it on hardware. So who knows what that bug is, but if you do encounter it, just let me know; you can either raise a GitHub issue or just wait until I can figure it out. Also, the pre-compiled binary can be slow to start on some systems, and that's just the way it was compiled. I tested this on a dual-core, four-gigs-of-RAM AWS instance, and honestly, unless your domain controller is running on a potato like that, you're not going to have

any issues, because it took like five to six seconds for it to start up and then go. That's on a dual core, so if you're running actual physical hardware, it'll go right away. And the application is multi-threaded, so it'll take advantage of those cores. And again, of course, any issues, raise them on GitHub. So here's the link to the repository there. And I don't know if we have time for Q&A or not. I think we do; I think I'm way ahead of schedule, got like 15 minutes. Yeah, we have quite a bit of time for Q&A. Yep, and I'll be available after this to come talk to, me and my colleague Ibrahim. Oh, he's right over there. Hi, how's it

going? Great presentation. Quick question I had for you: how do you guys feel your database compares to, say, the Azure AD Password Protection service that they recently brought out of beta, where they're checking against their own tools, but they say they're not using anybody's outside data? So I'm assuming they're doing their own magic, but does anybody know what their repository looks like in terms of known compromised passwords? Yeah, as far as I know, most of the people that I've seen are collecting stuff from publicly known sources, such as databases.today, and whatever was newest when people started building these services. So really, that's going to be data dumps that have appeared since

2017. We've been collecting stuff since like 2009, and we have data that goes back to 2006, and we also have data that hasn't appeared anywhere else that we know of. It's not in Troy Hunt's, it's not in anything. When I check my own email and everything, I see stuff that Have I Been Pwned doesn't have in there. So in all, the number of data breaches we have is 7,700-and-something individual websites, and those aren't even the pastes. So it's quite a lot of websites in there, and I'm not sure how many Troy Hunt has. If I recall, it was like two to three thousand. I don't know, I'd have to go and

look. He actually has his data sets available. One thing that we do want to do is publish which databases we have, so there will be a list and you could compare it to different services. But we offer this API for free. You do need an API key, as it'll say on the GitHub page, and the reason for that is so we can enforce rate limiting and everything. But other than that, yeah, you can use any of those services. Since the tool itself does dump the NTLM hashes for you into JSON or CSV, you can make your own tool set or expand the existing one. Just clone the repo or something and build your own

check against those services. Question here: you mentioned the zero-knowledge proofs and the distributed ledger for availability. My first reaction there was, is the ledger available publicly? Because that would be how I would see it being used. Oh, are you referring to the NuID solution? So this tool doesn't use the distributed ledger. I would just use zero-knowledge proofs within our own internal data set, to make it so you can transmit in the most secure way possible to check if the credential is leaked. But for the NuID solution, yeah, we do use a distributed ledger; that's Ethereum, but that's kind of out of the scope of this talk, I think. You

can come talk to me and Ibrahim afterward for that. - We might have. - Yeah, yeah, right up front here, yeah. - If I'm understanding correctly, it's making API calls back to your server to check, hey, has this password been compromised, right? Yes. How can I trust that you're not doing something nefarious with what I'm basically saying, hey, these are my passwords? Yeah, so... that's the unfortunate thing. Troy Hunt actually made a very good case for that; that's why he made k-anon. That's also why we want to adopt that model, just with the SHA-256-wrapped version instead, so that I only see the first five characters of a SHA-2, and that's gonna return quite literally

probably 10,000 results. I might have to bump it up to like seven characters; there's a happy medium in there. That way I'm not aware at all of what the heck hash you queried; it could be one of hundreds to thousands. Even with k-anon, though, when I looked at it, I realized you could still figure things out. You'd look at the incoming queries, right? And you'd see that, okay, you can grab each individual list from the queries, and now you basically have a combo list. A lot of those will fail, and some of them won't work at all because they're not a direct match, of course, but you'll

have the list of what they queried with k-anon. With the zero-knowledge proof, there would actually be no way for me to check what the heck you just queried; that's just the way zero-knowledge proofs work. And if we did adopt that into the API, I would have it well documented, how that works and why we can't check it. But as I said, you don't have to use the check argument; it's separate, so you can have that disabled, or you can go into the code in the repo and, when you compile it, just literally remove the argument entirely. And I put the API callbacks in there

into their own separate class, so you can just literally highlight the class, backspace it, and there's no more check. So, yeah. Hey, I have a question over here. Right over here. There you go. So I'm guessing, in my experience, to get the NTDS.dit, the only way to get it is as a domain admin. Do you need to run this tool with domain admin? Yeah, you would need to run this tool as something that has domain admin, or something with local administrator privileges on the domain controller will work as well. That's just the way it has to be for ntdsutil to have the right privileges. Thank you. And

this is just a confirmation: before I send the file, can I edit it and remove certain hashes for known accounts of mine? So I edit the file before checking; so for instance, script.gg is not sending the hash for that. Actually, the way the check function works is it just grabs the DIT file and then immediately starts checking. That's actually a good point. What I'll do then is make it so... Impacket has a way to just point to a DIT file; actually, I already have it in there, you can point to an offline one. So I'll probably just make that a feature and modify it to be able to do that. It's a good point. Yeah, but

currently, no. Were there any more? Whoa. Were there any more questions? I don't even know where the speaker was up there. All right, well, thank you for coming to BSides, and thank you to Robert for his wonderful presentation. Thank you.

Wow, it's so hot. Robert? Yeah. So we met virtually. Doug Blau. Oh, yeah. Hi, Doug. We've actually spoken on the phone. How are you? Great, you? Great presentation. Thank you. Thank you. Yeah, I'll probably set it up. Absolutely, yeah. And we mentioned this very briefly when we spoke on the phone when we were using the store. Today was our release day for one of the four times. Yeah. It's a lot of data in there. Each individual password has up to five hashes associated with it. So in total, the actual hash count is like around 13 million hashes, but that's just because it's different hash types. No, it was actually like a counter to it. It was, these

are like low-hanging fruit for securing Active Directory, and it went through, don't let people grab your NTDS.dit file, and this is how they do it, and shadow copy. So it was a very interesting counterpoint: this is how people are attacking your network, and this is how we can use those mechanisms to actually protect your network. Me and a friend were in both sessions back to back, and I was like, this is the exact opposite of the last presentation we just had. They told us not to do this, and he's saying, please let us do this. Who do we trust? But there are multiple ways to tackle the problem, though. So this is just one way of how to react now

instead of waiting for industry to adopt it. Because really, when you look at it, Active Directory is tied to a bunch of enterprise tools, especially VPNs. I've used this tool; I'm just like, query, boom, dump a combo list, feed it to it. Oh, look, I got six out of the 2,000-something accounts. Cool, now I'm in their VPN. Unless they've got some kind of two-factor. Unless they have two-factor, but that only slows them down. And sometimes, depending on the VPN, you can try the first credential, and then if it succeeds it prompts you for the next password. Like in the case when I was looking at Lockheed, they did that, and I was like, oh, well, I can enumerate whether or

not I have a correct credential right now. Yeah, so sometimes the way the two-factor is set up can actually be used to enumerate whether or not you're getting the credentials right, and then you have to figure out a way around the two-factor. And if it's an SMS message, you might as well not be doing it. Yeah, yeah. During pen tests, we've conducted some SIM swaps, and it's surprisingly easy and very scary to do. It's a shame we only got the tail end of that. I would have liked to see the talk. Well, I think it was definitely recorded on YouTube, but he literally walked through how to copy the DIT file, how to shadow copy it, defrag

it, and then... - I'm glad I didn't enumerate too much on that then, because I figured that most people here would probably be familiar with how to do that. - Anybody that was in the session right before this would have been very familiar. - Perfect. - The synergy between the two presentations was, I don't know if it was accidental. - Well, we gotta give props to BSides for that. - It was done very well. - Happy accident. - Great, thanks guys. - We'll get more of them. Mostly Collection #1 and all of that. So a lot of the data sets that really triggered people to freak out were Collections #1 through #5. That

was a really, really big thing. But a lot of people don't know that those are just aggregates of previous breaches. So most of the checking that I've seen is against those, and possibly Troy Hunt's API, actually. I know GitHub does that. GitHub has basically the top 4,000 passwords and just blacklists them. They're doing password blacklisting. And there needs to be a better solution around it. And what I'm trying to do with our research team is to just be able to release it. And we get our sources from... I have indexed in our rainbow table Troy Hunt's data and all of the databases he's got today, of course, just in case we miss something. I take anything public, and then I also take our human

intelligence stuff, which is stuff where we've embedded ourselves in the cybercrime community. That's where we get a lot of our data. That's where I'm able to pick it up on Telegram early, if not sometimes a year or two before it ever hits. Just sitting there watching it happen. Yeah. Yeah, yeah. It's taken a lot of work to maintain that intelligence aspect. But mostly I'm just lurking. Anyone can go and lurk in a lot of the forums where stuff gets posted. And every once in a while, you'll have some guy who's trying to prove something pop in. He's like, oh, I've got all this data, and you can call him on it. Like, no, you don't. You're just BSing. And they'll

be like, no, I do, and they'll throw it up on Mega. And that's how I got one of the recent disclosures at AIZ. Some guy threw up a bunch of FBI data. I was like, oh, he really did have the data. Download, send it over to the right organization so they can respond to it. Great presentation. Thank you. I have a question. I don't do much work related to pentesting. We're doing the single host. If I recover a single hash, is there a way to submit that hash? There actually is, yes. On the command line? No. No, you can build your own tool. It's a RESTful API, and I actually should have done a better

job of highlighting that the API is available beyond the use of the tool. So, yeah, there is. So I don't know anything about REST. Is there anything out there, like, guideline-wise, to do that? Yeah, I don't have the guide fully finished yet, but we'll have the ability to submit it, and I will probably make a command-line tool to submit a single hash. That'd be great. Because honestly, that would take me like a day to do. Oh, that's awesome. We can help them. Done. But yeah, definitely bring it up. And yeah, there is a way to check those things. The hashes we have in there are MD5, SHA1, SHA2, SHA3, and then NTLM. That's

if I can recover that password and elevate privilege without... other brute force methods. - Yeah, and so that's why I use that internal tool we generated. It's extremely useful, and yeah, I've breached quite a lot of stuff, especially WordPress sites. When I'm auditing some random WordPress site out there, 'cause I need to get into it because it's related to some big company I'm pentesting. It's strange how it works, yeah. - Awesome, great, thanks for the tool. - Sure, yeah, thanks.
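[Editor's note] The Q&A above mentions a RESTful API for submitting a single hash, with a command-line helper still to come. A minimal sketch of what such a client could look like; the endpoint URL and JSON payload shape are assumptions (the real API details weren't given in the talk), and only the hash types named by the speaker (MD5, SHA1, SHA2, SHA3, NTLM) are guessed at:

```python
import json
import urllib.request

def guess_hash_type(h):
    """Guess candidate hash types from hex length alone (MD5 and NTLM are both 32)."""
    by_len = {32: ["MD5", "NTLM"], 40: ["SHA1"], 64: ["SHA2-256", "SHA3-256"],
              128: ["SHA2-512", "SHA3-512"]}
    return by_len.get(len(h), ["unknown"])

def build_submission(h):
    """Build a JSON request body for a single-hash lookup (payload shape is assumed)."""
    return json.dumps({"hash": h.lower(), "candidates": guess_hash_type(h)}).encode()

def submit_hash(endpoint, h):
    """POST one hash to a hypothetical REST endpoint and return the parsed reply."""
    req = urllib.request.Request(endpoint, data=build_submission(h),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example call (placeholder endpoint, not the real service):
# submit_hash("https://example.invalid/api/v1/lookup", "5f4dcc3b5aa765d61d8327deb882cf99")
```

Length-based type guessing is ambiguous on purpose: a 32-hex-character value could be MD5 or NTLM, so the client submits all candidates and lets the service decide.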

- Yeah, it clicked. - Yeah, it clicked. And also, well, I was practicing while rehearsing, so it just got easier. - That's great, that's cool to hear. - Yeah, 'cause you wanna do this, and then it's so,

Test, test, one, two. Test, test, one, two. Test, one, two, three, four, five, six. Test, one, two, three, four. 1, 2, 3, 4, 5, 6. 1, 2, 3, 4. 1, 2, 3, 4, 5, 6. 1, 2, 1, 2. 1, 2, 3, 4, 5, 6. Test, 1, 2. Test, 1, 2. 1, 2. 1, 2, 3, 4, 5.

1, 2, 3, 4, 5, 6, 1, 2, 1, 2. Run? Run? - Do you want me to introduce you or something? - Something special. - It's your name? - Yeah. - All right. Do you anticipate, like, how long your talk is gonna-- - Talks can be up to 30 to 35 minutes, so plan in time for Q&A. - Okay, so you want me to start, like, doing the cue cards, like 30-- - Oh, okay, so the cue card, not the mic. Sure. - So, I'm saying, like, you wanna start leaving Q&A open once you hit, like, the 35, 40 minute mark? I mean, it'll probably be done at 30, 35. So Q&A is fine. Do you need audio

from your computer? No. Perfect. Just turn this down here. Then we go to this mic. It's your kill switch in case I need to call or something. First time someone's asked that. In advance, you can be like, I'm sorry, shut it off, and then turn around. Just put this on? Is that okay? You're going to switch it off. It's going to be my first time. It's more than I have. I like them. It's very special. It's probably somewhere there.

So that's on. Test, test. All right. Okay. Sorry. Did not mean to hear you. I don't know where that loop is coming from. Does anybody know where that loop is coming from? It's not coming from me. Yeah, it's perfect. Thank you. So we're going to record you. So I just want to let you know that you are being recorded. I'll get the situator right there for you. I'm going to ask you in a second to flip it back on. Count to like 10 for me. I want to check the levels, make sure everything's good. Sounds good. And you can kill that thing. That's good for backup sometimes in case something dies. But I'll let you

know. Check. One, two, three, four. What's that? One, two, three, four, five, six, seven, eight, nine, ten. Two, three. Hello. All right, we're gonna get started. So hello everyone. Welcome to BSides and welcome to Ground 1234. You're here for the talk, Giving the Dog a Bone: Exploring OSINT Capabilities of Pentesting Tools. And this is our speaker, John Brunn. First, just to talk a little bit about the sponsors, because without them, we wouldn't be able to get this conference off the ground. We've got Critical Stack and Valimail and Secure Code Warrior, Paranoids and Robinhood, along with many others. Again, without them and our donors and volunteers, we would not be able to make this possible. Just a couple

quick notes. Please keep your cell phones on silent or turned off, because these talks are being live streamed, so we don't want any disturbances. And at the end of the talk, we're going to be doing a Q&A session, so if you have a question, just raise your hand and I'll be going around the audience with the mic. And without further ado, I will give it over to John and get this started. Is that me? Okay. All right. We're okay? We're okay. All right. So, yes, thank you for coming. My name is John Brunn. Quick background on me. I'm currently the head of security at CMD. I live in San Francisco. I've been there for about

20 years, working in various information security roles there in San Francisco. And my contact information: if you want to watch me not tweet, you can follow me there. So, a quick caveat slide: this research and the opinions you're about to hear are my own, not my employer's. If you decide to take any of this information and use it, if you think it was a good idea, then you're on your own; use at your own risk. So at my previous gig, I spent a lot of time doing a number of security roles, but what was really interesting was some security awareness training we were doing. And when we say awareness training, really it was phishing and spear phishing our employees and our executives. And that's a lot of fun

if you get a chance to do it, especially the executives. I was really surprised by how efficient the spear phishing was. Now, we understand this to be true, but the contrast between the two was really interesting. And so I really started thinking about the comparison between targeted attacks on operating systems, open ports on the internet, and software, versus kind of a spread attack, you know, someone scanning the entire internet hoping to find vulnerable servers. And so I was thinking about the efficiency you get, and really one thing kind of stuck out in my mind. The question was: as people are moving into the cloud,

has automated scanning kept up with the tooling that the cloud allows people to use? Essentially, as pets become cattle, how have the automated scan tools kept up? And so I went to an interesting talk at BSides San Francisco this year. Someone was talking about the idea of compromising containers. And they put a WordPress container up. And it was fully patched. And they said after a number of weeks, it had never been compromised. And so I took that thought-- it really kind of tied back into the idea I'd been having for a while, which is, well, you kind of expect a fully patched system

to not be compromised. But because it wasn't compromised, I started thinking, were people trying to brute force it? Did they not care? Were they looking at the system, seeing that there were no patches available-- that it was fully patched-- and just walking away? And it really started getting my brain moving. The other thing they have at BSides San Francisco is drink tickets. And so a few of us kind of got together and talked about the talks, and we really started hatching this idea out, right, about this particular WordPress installation and the expected odds of whether or not it should have been compromised. A few drink tickets more, I really

started making some bold statements, like: I bet you could put the username and password somewhere in your WordPress fields and no one would ever notice it. A few more drink tickets, and my friends told me: this sounds like a great research project, please stop talking about this. So this is really the genesis of where this talk went. And here are my general thoughts, almost like a hypothesis. With the advent of the cloud methodology that everyone's taking on, which we're going to dive into, I think the reconnaissance tools have to work smarter, not harder. But I don't think they actually are. I don't think they're able to

take OSINT information. If you're able to give them breadcrumbs, if you will, of what we're doing, either purposely or accidentally, are the tools smart enough to either take that information, turn it into an input, and continue scanning your environment? Or just flag it as an alert or vulnerability, like: this is something that you should look at, this is abnormal. To the WordPress example, could we literally say, here's a username and password, and have someone actually compromise the system or use it against us? And in the back of my mind: is brute forcing logins, especially via SSH, really still a thing anymore? So I really wanted to look

at SSH. I wanted to focus here, trying to keep the scope small, but specifically because if you compromise the system via SSH, you have a shell. There's a good chance that you have root privileges with that. You can do a number of things. You can look and see the information. You have a foothold in someone's network. Maybe you want to get metadata from the system and look at other systems that are misconfigured in the cloud. So I thought SSH is really a good place to start. So why did I think that cloud methodology would make tools have to work smarter? It's not really just cloud

migration. In my mind, many companies have done the lift and shift. It's not just moving your data center to the cloud anymore. A lot of us are now using cloud methodologies. You're talking auto-scaling groups. You're talking immutable infrastructure, infrastructure as code. Essentially, you don't have long-lasting systems anymore. They're not being patched; you're redeploying new systems. And so for attackers, that becomes difficult, because what could be a Redis server, 10 minutes later could be a MySQL server, 10 minutes later could be an Nginx server, as it relates to a specific IP. So their information is becoming invalidated very quickly. Another interesting aspect in the cloud: it's very difficult to do reverse DNS. To

the point, I don't think a lot of people are doing it anymore. So you might be attacking a system. You might have a potentially vulnerable system, and you don't know who owns it. It resolves to Google or Amazon. And so it's not as easy to even understand whether or not you have a system that's worth attacking. You can say what you want about IDS and IPS, but the defense tooling has really matured, and the barrier to entry and the cost have come down significantly; it's very inexpensive now to roll out these tools. When it comes to monitoring and alerting, people are using SIEMs now; they've really got a feedback loop. So when you

have something like a brute force attack via SSH, your system's going to understand that. It can alert on it. You can also start telling your API servers that this is a bad IP. You can tell a threat feed that this is a bad IP. You can find out from other threat feeds that another company found a malicious IP address. So the tooling is much improved. And specifically for SSH, two very cool and very free products, Fail2Ban and SSHGuard, are fantastic. They allow you to ban systems based on a threshold, say two to five failed attempts in a number of minutes. And they just ban them, and then they are not allowed in

your system. And SSHGuard is very interesting: it actually comes by default with GCP's Ubuntu images. So this leads to what I call the Heartbreaker theory. Tom Petty and the Heartbreakers had a very simple mantra when it came to songwriting: don't bore us, get to the chorus. So essentially they said, we know what our fans want; they want the hook. They don't want a slow build; they're not doing Stairway to Heaven, right? They're getting right to the chorus. And in my mind, with the cloud methodology adoption that we're seeing, scanning tools can't spend a lot of time hammering at a server. They need to get in, get the information they want, and get out, preferably without being detected. And at some point, I think there's diminishing returns with

the IP address changing very quickly. I also don't think adversaries, if they're smart enough to write tooling to look at the internet and look for this OSINT and do this research-- I'm not positive, and again, these are all my theories-- I'm not positive that they're going to do it against the whole internet themselves. Things like Censys, which are made to scan the entire internet-- and we're going to go into this a little bit later-- those are probably the tools to work off of. My guess is they're probably writing tools against Censys rather than writing tools against your servers and my servers. So I call this the attacker's

new world order. It's very straightforward. It's a lot easier to be detected; therefore, it's easier to get blocked. It's very easy to get in that feedback loop where you're hammering on a server. You're going to get your same IP, or maybe even your same IP block, flagged by those API servers. You might get banned from other companies. And really, I think if you have one known exploit, it's probably easier to find a thousand servers you can use that exploit against than it is to try to hammer and get a thousand servers for your botnet using brute forcing. And that last point, just the idea that if you

think you have a vulnerable server, you don't know how much time that server is going to be available in the cloud. Your production bastion host could easily become someone else's development Nginx server. So attackers need to move quickly and smartly. So we're going to look at testing with SSH. I wanted to test these theories. I wanted to do a number of things, and I wanted a lot of levers to play with. So I wanted to register a new domain. I wanted to launch systems into multiple cloud providers. I wanted to do some reporting on it. Obviously, we all want profit. So, quick information, some caveats. I'm using fully patched Ubuntu 16 systems. Sample size concerns obviously are going to apply here, right? So the data is

what I got, but I'll make some generalizations; it is what it is. I'm also looking at automated tooling. I'm not looking at Danny Ocean specifically targeting Terry Benedict here, right? Maybe in the testing, since we're going to open these systems up to the world, we're going to catch some of those. That's not specifically what I was looking to go after. So, we launched the Nagini project. So, I didn't consider this a honeypot. I didn't really care what a user was doing on the internet. I didn't care what happened if someone brute forced my machines. In essence, everyone I talked to when I launched the idea was like, this

is a nice little honeypot project you have. So, after the 15th person told me that, I'm like, all right, screw it, I have a honeypot project. I wanted to name my servers and my domain; Harry Potter seemed to match the names, a little bit too on the nose. Asked my wife; Nagini apparently is Lord Voldemort's snake. So I registered nagini.co. Why .co? It's inexpensive. I launched systems in three cloud providers. I picked GCP and AWS, and someone suggested that I take a look at DigitalOcean. It's a little bit different because not as many enterprises are on it; maybe it's more of the Wild West. And so it probably gave a little bit

different data than, say, using Azure as a third cloud provider. Again, fully patched Ubuntu 16 servers, port 22 open to the world. It's very important, if you want brute force data, to disable the brute force security software. This actually was a problem in the beginning. GCP enables SSHGuard, for example, which is a great feature, but not what I was looking for. So I made sure I disabled that. Also, I did centralized logging using ELK. I used Elastic Cloud; it has a very powerful two-week trial cluster for free, and then from there you can adjust it and pay. It's not too much money. It's a great platform to use for research projects like this. And when you think you overwrote your log files and you thought you really screwed yourself, when

you're logging in real time using Filebeat, it's fantastic. That may or may not have happened. So I want to go through common testing methodology, but also some of the technical details. I'm going to go into some of the details, namely... I'd been thinking about doing this for a long time. The BSides talk kind of kicked off me actually standing up and saying, okay, I'm going to do this. But one of the reasons I didn't do it was because I was really nervous that the technology was going to be over my head. We'll get into some of it. And I just wanted to let you know, once

I looked at it, it was very simple. So as we go into the details, don't let what seems to be too big a wall of technology stop you from actually trying to do some of these things. When I did the deep dive, they're like one-line changes. Very simple. So how are we going to test this? The goal here is we're going to leave some breadcrumbs on the internet. We want to see what this automated tooling is going to find, if it's going to be able to find it and mark the systems as vulnerable. So we're going to increase the logging. That's very

simple. A nice safety feature of OpenSSH: password authentication is disabled. We're going to have to enable that, so we did. This is very simple in the sshd_config file. Now, another nice security feature of sshd, which doesn't help me at all, is that by default it's not going to log the passwords that people are entering. So we're going to have to fix that. Now, again, we're talking breadcrumbs: how are we going to publish these to the internet? There are two ways using SSH that I identified that could do this. One is what's called the pre-auth banner. This is when, right before you log into the machine, you get a banner, and

we'll go into that. The other thing is what's something that really intrigued me, because I've been wondering about this for years, what this was for and what I could do with it. It's called the protocol banner. So, part of my apprehension to starting this was I hadn't done UC since my intro to C in college, which was, let's just say, a very long time ago. So, I was very nervous about taking this on. But it turns out it was a one-line, a simple one-line fix to add logging of the username and password and the IP address when the person was doing. So, I probably spent eight months in apprehension and ten minutes on the fix.

The pre-auth banner is pretty straightforward. The CIS benchmark, for example, suggests or kind of requires that you have it. A number of default SSH installations have this. It's just a standard text file. There's almost-- I didn't test this-- probably not a limit to what you can put here. I imagine at some point the SSH client is gonna stop returning some of this data. But you can kind of see, a lot of people say, you know, there's no unauthorized use of this system. I kind of say no unauthorized use, but if you want to use it, here are some usernames and passwords you can test out. The protocol banner, now this is the part in

the bottom; this is what the SSH server gives during the initiation of the session. I always thought this was fascinating. And look, especially this piece at the right; I never knew what this was for, right? On some systems it just says Ubuntu, or whatever OS is being used; sometimes it's blank; sometimes, like Raspberry Pi distros, it says Raspberry Pi. I didn't know why. Well, of course there's an RFC for that. I'm going to save you the time; you don't have to read the whole RFC. Let's look at section 4.2, the protocol version exchange. This one line just dictates what that protocol exchange is.
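[Editor's note] The identification-string rule being discussed here comes from RFC 4253 section 4.2: `SSH-protoversion-softwareversion SP comments CR LF`, at most 255 characters including the CR LF, with everything after the first space being free-form. A small sketch of a parser for that rule:

```python
def parse_ssh_ident(line):
    """Split an SSH identification string per RFC 4253 section 4.2.

    Format: SSH-protoversion-softwareversion [SP comments] CR LF,
    255 bytes maximum including the CR LF. Everything after the
    first space is the free-form comment field the talk plays with.
    """
    if len(line) > 255 or not line.endswith(b"\r\n"):
        raise ValueError("not a valid identification string")
    body = line[:-2].decode("ascii")
    if not body.startswith("SSH-"):
        raise ValueError("must start with SSH-")
    head, _, comments = body.partition(" ")        # comments = everything after first SP
    _, protoversion, softwareversion = head.split("-", 2)
    return protoversion, softwareversion, comments

# The stock Ubuntu 16.04 banner looks roughly like this:
# parse_ssh_ident(b"SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.8\r\n")
#   -> ("2.0", "OpenSSH_7.2p2", "Ubuntu-4ubuntu2.8")
```

The parser makes the speaker's point concrete: the left half is constrained by the RFC, while the comment field after the space will accept anything that fits in the length budget.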

So essentially, what that says is that the first half has to be of a certain form; it has to match the protocol versions and so on. So I don't want to touch that half. As long as there's a space, I can put anything I want to the right, with a carriage return and a line feed at the end. I've got 255 characters for that entire line. Besides that, the world's my oyster, right, in that comment section. So my servers, my Naginis that I launched, start with this, right? This is what Ubuntu 16 ships with by default. But I can start doing some fun stuff and seeing, you know, if

this triggers anything, what's going on. I can slowly start getting a little more bold, or I can start saying, "Please attack me. This is what, you know, I'm here." And so I did. How do you enable that? Simple: version.h. Literally, it's just the end of that line. The only issue here, which again felt daunting and turned into a simple task: to enable logging, and every time you change this banner, you have to recompile SSH. And to make it simple-- I'm using Ubuntu-- you want to create Debian packages. I wrote a simple 18-line Dockerfile. This took five minutes to compile. Had to do it every time; I built 12 to 15 of these. Very simple. You

just basically take the existing package, copy in the two files I modified, and rebuild. So I got my list of Naginis. One of the reasons I really wanted to register a domain for this-- I didn't want to go too over the top, but I wanted a number of levers to play with. I wanted to see the statistics in terms of what might drive traffic. So some systems I put in DNS, some I did not. I leaned pretty hard on GCP for the simple reason that they give you $300 of free credits, and I pivoted towards free. The AWS systems, I wanted to compare against the GCP hosts. And of course, DigitalOcean

was a late add. Also, some systems are in DNS, some systems are not. On a few of these, I actually installed Nginx and registered a Let's Encrypt certificate. I wanted to see if certificate transparency logs might help drive traffic to these systems and whether or not that made a difference. So what did we see? The systems ran for varying lengths-- and these are all still up, maybe a bad idea. The oldest ones are 45 to 50 days. I'm gonna publish the data, because I couldn't get the context correct here; I'm gonna go over the context. The raw data doesn't work well in a presentation where you're at 12

point font. So let me generalize what we saw. I used 11 Naginis, multiple launch dates, multiple cloud providers. Over 1.2 billion brute force attempts. Sorry, 1.2 million brute force attempts; we'll edit that out. The interesting thing in terms of the data centers: the AWS hosts definitely saw more traffic. Granted, all the sample size caveats apply, right? But significantly more traffic in AWS. However, the GCP hosts saw significantly more unique IPs scanning them. For example, a couple of these had a couple hundred thousand hits. Very interesting. Unique IPs: for Nagini 1, we had 2,500 unique IPs. One subnet constituted about 95% of those. It was not from North America. That was

pretty-- again, I'll publish all this. The interesting thing was you found a few IPs from different hosting providers that stopped at exactly 100 requests. And I had like 15 of them that flat out just said 100, 100, 100. I started doing the research on each one, each from a different provider: one was from DigitalOcean, one was something in France, one was from AWS. So it does seem like there was some smart control of all of these. I'm not gonna say a botnet. As some interesting research, I also ran a lot of these through GreyNoise. And even of the ones that only scanned once,

nearly all the IPs came back as known by GreyNoise, which is kind of a threat feed you can query. It's a great tool to query what these systems are. Some came back as known malicious. Some came back as probably scanners. But very interesting: 95% of the hits were from the same subnet. It seemed that if a system was in DNS-- and this is not reverse DNS, just normal DNS-- it made a difference, but I didn't have enough data to say that. I do have enough data to say the biggest surprise was DigitalOcean had very, very little traffic. I was kind of surprised by this. Most of the servers were easily averaging 5,000 to 10,000 brute

force login attempts a day, and DigitalOcean was less than 800 per day. And the systems ramp up very quickly. Especially in AWS: if you stand up a system in AWS, the first day you're easily getting 5,000 hits against it. TLS: there were almost no discrepancies between the hosts with TLS and the hosts without. Certificate transparency logs didn't seem to do anything. So, my hypothesis: we could do all this stuff and nothing would happen. We could say, here's my username, here's my password-- not suggesting this is a good idea-- but we could do this and nothing would happen. So what happened? Nothing happened. No one attempted once to brute force my systems. So I

struggled with this, because this was my hypothesis. Granted, it was, you know, a drunken hypothesis, but I stood by it. I did not think this was going to happen; I was really hoping it wouldn't. I was hoping we were going to get some attacks; I was hoping then we could start laying more breadcrumbs, saying, can we start sending the attackers to different spots? It didn't really happen. So what's next? I wanted to look at-- and maybe this was expected; I did not look at the well-known tools ahead of time before I made this hypothesis. So let's look at that. Let's see the data they're gathering.
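[Editor's note] The per-IP statistics described a moment ago (unique source IPs per host, and the odd cluster of scanners that stopped at exactly 100 attempts) are straightforward to pull out of centralized sshd logs. A sketch over auth.log-style lines; the log format assumed here is stock OpenSSH "Failed password" messages, not anything specific to the speaker's ELK setup:

```python
import re
from collections import Counter

# Matches stock OpenSSH failure lines, e.g.:
#   "Failed password for invalid user admin from 203.0.113.9 port 52814 ssh2"
FAILED = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+) port")

def attempts_per_ip(lines):
    """Count brute-force attempts per source IP across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        m = FAILED.search(line)
        if m:
            counts[m.group(2)] += 1
    return counts

def capped_scanners(counts, cap=100):
    """Return IPs that stopped at exactly `cap` attempts -- the pattern that
    suggested coordinated, rate-limited tooling in the talk."""
    return sorted(ip for ip, n in counts.items() if n == cap)
```

Feeding the whole corpus through `attempts_per_ip` and then `capped_scanners` would surface the "exactly 100 requests" cluster directly, without any dashboarding.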

and take a look to see what they're seeing. So I'm going to start with Nmap. It's a great tool; it just hit, I think, its 22nd birthday. It's very basic, but a lot of scanning tools either use it or it's kind of the engine of what they were built on, in a very basic sense. This is hitting a host where I did not change the banner. You can see it seems to know what it is, and everything looks normal. Now I take a similar system; everything else is the same except I changed the banner. I'm going to overlay these to make it more apparent. It doesn't know the operating

system. It doesn't really know what's going on. It just fingerprinted that comment section. At the bottom, it just says, I don't know what this is; you can submit this fingerprint if you want. There's no real red flag saying this might be vulnerable. It just says, hey, I've never seen this before. So I started looking at all the other top-20 OSINT and research tools available. And, you know, I knew Censys and Shodan; those are the big dogs, and they're fantastic tools. But all the other tools I looked at, just to make sure I was covering everything, essentially seemed to use Censys and Shodan as their search engines, right? So that's what

I wanted to take a look at. What did they see? Did they find my systems? Are they suggesting that my systems are vulnerable? Again, taking those breadcrumbs, suggesting that this is a vulnerability or something to look deeper into. Well, yeah, this was actually great news. Censys found my systems. If you're not familiar with Censys, it was based on a project out of the University of Michigan, which I'm legally obligated to say, since they gave me research access and I'm an alumnus. Then they-- yeah, I saw you outside. Ohio State guy. They scan the entire internet once a week, and they grab a bunch of information. In this case they happened to grab all of my SSH banner information. What about Shodan? Yeah, Shodan found it too.

Now, I should note, Censys and Shodan do not have exactly the same feature parity. For this use case, I feel they're pretty much the same thing; I know they have different use cases. The other thing is Shodan, to my knowledge, samples IPs; they don't claim to scan the entire internet. And I actually found that to be accurate as well: Shodan found about 80% of my systems, and Censys found 100%. So I started wondering, well, maybe I can't be the only person putting the word password, for example, or username in their SSH banners. Until I looked at Censys, and sure enough, I apparently am the

only person that was dumb enough to put the word password. So when you think of that, maybe it's not unusual that we didn't see any attacks, right? Maybe this was expected. The other thought I had was, maybe people are like, well, this is a honeypot, then I'm just not going to attack it. I'm going to give you my crap data, my brute force systems, and I'm going to do something over here and distract you. That's absolutely logical and could have happened. Shodan has a nice honeypot-score feature. It agreed with me: they're not honeypots. So apparently I didn't red flag anything in any of these scans. This... this really surprised me. You know, we talked about the two banners. I really harped on the one I was

really interested in, which was the SSH protocol banner. Remember that other banner we did, right? This is the pre-auth banner. Most of your systems are probably using this in some form. Hopefully you're not as dumb as I am. Nothing indexed these. Nothing found this. So when I started looking, I'm like, this was very surprising to me. And I started getting a little-- I won't say nervous, but I started thinking in a different direction from where my original statement was. This is a working SSH public key and user that I published. So this is on a system that's being scanned on the internet. No one's attempted this, which isn't really surprising once I looked into it. This is Censys's statement in their FAQ: they actually don't do

authenticate; they don't attempt to authenticate, and you only get this banner if you try to authenticate. This is where I get scared. If no one's detecting this, how do I know, as a security practitioner, that I don't have engineers who are doing this? If no one can detect it, and trust me, I've known plenty of engineers who feel stifled by security; if they knew they could share a key using SSH banners, they would do it. I don't know if I have a tool that can detect it. So now I'm starting to get nervous. So where do I go from there? I wasn't thinking about Nessus, Qualys, Nexpose,

how those would fit into this, because, you know, they're expensive; I don't think attackers are using them. But as security practitioners, you probably are, even if only because your compliance team mandates a tool. So let's take a look at that. Did those find this OSINT data? Did they say, this is something to look at, this is something that's misconfigured, it's a vulnerability? No, of course they didn't. The Qualys report was pretty interesting: this version of SSH is apparently susceptible to user enumeration, and on the same page of the report it actually gives a username, and they don't notice it. The other thing I

thought was kind of funny is that for the SSH banner, they flat out say there is no exploitable information. So the commercial tools aren't going to help us here either. And that just came back to the same thing: I use those tools, and I figure they're all kind of the same. So I got a little pissed and I'm like, will someone please hack these systems? And so I did a data dump. And what happened? Nothing happened. I'm not saying you should do data dumps; it's just really driving home the fact that I don't think the automation is there to make smart decisions based on this information. So, to kind of summarize what we found.
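A concrete aside on those banners: the SSH identification string these scanners index is defined in RFC 4253, section 4.2, as `SSH-protoversion-softwareversion SP comments`, and anything after the first space is free-form comment text, which is exactly where data like a username or key can ride along. A minimal sketch of the kind of check a scan tool could run (the keyword list is my own illustration, not something from the talk):

```python
import re

# RFC 4253, section 4.2: SSH-protoversion-softwareversion SP comments CR LF
BANNER_RE = re.compile(r"^SSH-(?P<proto>[0-9.]+)-(?P<software>\S+)(?: (?P<comments>.*))?$")

# Illustrative keyword list; a real detector would want a better heuristic.
SUSPICIOUS = ("password", "username", "ssh-rsa", "ssh-ed25519")

def parse_banner(line: str) -> dict:
    """Split an SSH identification string into its RFC 4253 fields."""
    m = BANNER_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an SSH identification string: {line!r}")
    return m.groupdict()

def looks_suspicious(line: str) -> bool:
    """Flag banners whose free-form comment field carries credential-like data."""
    comments = (parse_banner(line)["comments"] or "").lower()
    return any(word in comments for word in SUSPICIOUS)
```

Even a check this crude would have flagged the planted banner, which is the point: nothing in the commercial tooling appears to look at the comment field at all.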

I don't think, to that point, that scan tools are equipped to really deal with this, to either mark it as a vulnerability or just tell you, hey, this is different, I've never seen this, you should look at it. The other interesting aspect: I'm not positive any smart scanning is happening, given the data points and the sample size. To give you an example, if you launch an Ubuntu server on AWS, unless you spend a lot of time reconfiguring things, there's going to be an ubuntu user on that machine. They launch with it; they want you to use a credential that you download, or you can automate it. Therefore, if I view an Ubuntu banner,

if I don't muck with the banner, you know it's an Ubuntu system. And if you're using reverse DNS, and AWS publishes their IP ranges, you know it's an AWS host. You know it's an Ubuntu banner, so there's a 98% chance there's an ubuntu user on that host. The system I had was getting 5,000 brute-force attempts a day, and only eight of them used the ubuntu user. That seems like a pretty easy marriage for a logical attack, and no one's even doing that. SSH pre-auth banners are not being indexed, and I'll get to that in a minute; that's really interesting to me for a number of reasons. You do wonder if

Censys and Shodan are getting so much information that it's just a signal-to-noise problem, a needle in a haystack; it's hard to dive into this. And lastly, do not do this at home. This really should be for research purposes only. But, and this is not where I started, I am now nervous, because I know engineers who will do this, and I'm nervous I can't detect them. So what's next? Where am I going to take the research, or where am I hoping people will take this? I launched a couple of blogs, back to the WordPress example. I'm going to leave these live for

a while. I registered buckbeak as well, another Harry Potter reference, and I've left some breadcrumbs on there. I'm kind of interested to see how long it takes for them to get picked up. What I can tell is, and anyone who's ever done web SEO probably knows why they're always complaining, the web is a long game. I think it takes a long time to get indexed; I don't think two to three months is a big enough sample to even know who is picking up or scanning things. So I'm going to leave these up for a while and see. I had to hack PHP to get the username and password onto this, and that was awful, by the way. Hacking C after 20

something years was much easier than hacking PHP. Jesus. But it's done. I really want to do more of an IP address breakdown. I was surprised how much data there was, and so I didn't have a chance to analyze it as much as I wanted. Were systems coming in and hitting every five minutes to test whether they were going to be caught, and then starting to ramp things up? Was it a bunch of bad IP addresses that they just flood, that get blocked and they don't care, and when they're not blocked they send in the cavalry and start looking with the real IP addresses? I'm kind of interested. The

one thing I didn't do, which I should have done, was launch a system with SSHGuard and the security controls in place, see what that data looked like, and match those IP addresses up. I'm probably going to take that on. The pre-auth banner indexing is, for completely different reasons, extremely interesting. In a world without reverse DNS, you don't know what hosts are. But the CIS benchmarks, which most any company dealing with any sort of compliance has to follow, require you to have a banner. And your organization's banner is most likely going to be somewhat unique to

your organization. So if I can fingerprint the world, and it's going to be expensive, but if I could fingerprint all the banners in the world, I could probably start picking out which IP addresses are related to each other. That is not exactly what I was thinking of when I started this, but it got my mind rolling; it would be very interesting. Censys does not pick up this information, but I'm kind of interested in some other things. I do want to look at some more advanced tooling. I don't have any hope that the commercial tooling we talked about is going to pick this kind of stuff up, so I don't have any hope

that we're going to get any coverage for our own coworkers who might be screwing things up, having the same dumb idea I had. There is someone, I don't know if the person's here, who emailed me about something called Intrigue Core. It's a very interesting platform that back-ends a number of OSINT sources like Censys, and it seems like you can make it very extensible. I'm very interested in using that for something, especially around the banner fingerprinting. So that's all. That's my GitHub; I'll probably post this deck as well as the IP address data once I analyze it a little bit further.
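The banner-fingerprinting idea mentioned a moment ago is mechanically simple; the expensive part is collecting the banners, not grouping them. A sketch of the grouping step, assuming you already have an IP-to-banner mapping from scan data (the normalization rule is my own illustration):

```python
import hashlib
from collections import defaultdict

def fingerprint(banner: str) -> str:
    """Stable fingerprint for a pre-auth banner after light normalization
    (collapse whitespace, lowercase) so trivial formatting differences
    don't split an organization into separate clusters."""
    normalized = " ".join(banner.split()).lower()
    return hashlib.sha256(normalized.encode()).hexdigest()[:16]

def cluster_by_banner(observations: dict) -> dict:
    """Group {ip: banner} scan observations into banner-fingerprint clusters.
    Hosts sharing a cluster likely belong to the same organization."""
    clusters = defaultdict(list)
    for ip, banner in observations.items():
        clusters[fingerprint(banner)].append(ip)
    return dict(clusters)
```

Run over a full-internet banner corpus, clusters with more than a handful of members would be candidate "related IP" sets, which is the correlation the talk describes wanting in the absence of reverse DNS.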

So if anyone has any questions, you can just raise your hand and I can come over and bring the mic. Hi. Very interesting talk. I think if you look at it from the perspective of expected return, though, I'm not surprised at your results at all, because it's probability of success times reward, and where's the reward? Yeah. I really thought that if something was to happen, it would be someone stumbling upon it, not necessarily an automation tool. That's one of the reasons I really increased the verbosity of the logging. When you increase the verbosity, you can start seeing in the error logs when someone's doing a protocol exchange versus

actually trying to authenticate. So I wanted to have that information because, to that point, my assumption was it would not be something automated. Someone would stumble upon it and say, this is stupid, all right, I'm going to try a few things, and then I'd have that data to use to say, well, did that person scan it? Are they using something else? Then it might lead me in a different direction. Anyone else have a question? All right, I guess that's it. Thank you very much.
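The verbose-logging point in that answer can be automated: OpenSSH's logs distinguish pre-auth disconnects from real login attempts. A sketch of a classifier over representative sshd log lines (the patterns match common OpenSSH message formats, which vary by version, so treat them as illustrative):

```python
import re

# Lines sshd emits before any authentication is attempted.
PREAUTH = re.compile(r"\[preauth\]|Did not receive identification string")

# Lines recording an actual credential attempt, with user and source IP.
AUTH_ATTEMPT = re.compile(
    r"Failed (?:password|publickey) for (?:invalid user )?(?P<user>\S+) from (?P<ip>\S+)"
)

def classify(line: str) -> str:
    """Separate pre-auth protocol/banner exchanges from real login attempts."""
    m = AUTH_ATTEMPT.search(line)
    if m:
        return f"auth-attempt user={m.group('user')} ip={m.group('ip')}"
    if PREAUTH.search(line):
        return "preauth-only"
    return "other"
```

Splitting the log stream this way is what lets you ask the question from the Q&A: did a given IP only grab the banner, or did it come back later and actually try credentials?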

That was cool. I like that. Kind of your thought process around that and just that experimentation. So that's a good story. I had so many experimentation ideas. If they did get hacked, I was like, okay, can I send you over here? Can I get you to do something like that? And nothing happened. But you just stuck up a... Did you turn that off? I did. You threw up a bunch of stuff. I don't know. I have that sort of impression like if it's open, it's going to get hacked very quickly. I want to find out how to do the model. Yeah, but they don't know. But you're thinking they're going to try to see if there

is value because they don't know. They wouldn't necessarily know that there's value. There's a bunch of things you have to prepare for something like this. Well, it was a throwaway domain.

- So basically what happens is now they're pregnant as if it's a same domain. So now you're an oyster issue. Yeah. Yeah. Yeah, I mean, the way I was approaching it is if I throw a bug bounty asking about it, it's 50 bucks someone's going to look at it. I was trying to see what would happen if I didn't force anyone there. But you're kind of proving it. You're kind of proving it. This is a bad way to put it. We have a security by anonymity. No one cares. I mean, that's the thing. Intrinsically, you're like, oh, my gosh, this stuff's hitting against my firewall and all these bad actors. Oh, my gosh, I'm glad I have my

firewall up. That's not fair; they're really not trying to hit anything. Have you taken a look at logging your reverse lookups? I was looking at reverse DNS lookups. Did you run your own DNS server? No. Did you run your own server? No. You'll get all the content in. Yeah.

So yeah, there's a bunch of good stuff already. Yeah, I didn't want to go down. I mean, there are better tools than this. I'm just saying, there are better thoughts than this for the actual enticing. My question is, is it going to be just simple? It's an internal thing. Yeah, I was thinking of one that's going to be saying, I'm not going to do it right now. Enterprise is something that I'm interested in. - Yeah, that's what I'm here for. - So, and ask the other thing. - I don't think it's gonna be that long before. - It's gonna be about a year and a half. - And you're only using that thing before, so. -

Yeah, yeah, I don't. - Yeah, that's good. - Thank you. - Yeah, Paul, does that help? - Good, yeah.

Right. Yeah. Yeah. Yeah. - That was my next step was to talk to some European threat companies that have like, I mean, you know, pre-pods and what happened to them compromised. That definitely would be like doing something to the building anymore. But it wasn't actually a fun thing to do. - So they make it 100% IP and stuff, and we're putting it on a store so we can see what it does without it. At least it's not a stable fucking house. But they may still, to each and what I can tell, - That was my dad's social media. - Oh, no, I am. - I mentioned I'm actually a corporate leader. - That's rude. - That wasn't

in all my thoughts. - No, but it's a cool observation. Thank you. Sure. Let's back our business.

I have a clicker with me. Do you need audio? I do not need audio. I have a video, but no audio. Audio is evil. Water? Actually, I have hot water already. We're trying to reduce single-use plastics. It's important. I'm going to introduce her today. Does the clicker pointer work? Oh, the one with the momentum? Yeah, this one does a thing where you can do that. But you've got to install some shady software that has accessibility access. You have to install Logitech software with root access. I don't see why it needs it, but it does. OK, I'm all set up now. I can talk to you. We have a question, when it comes to introductions.

- FIDO, like the dog. I actually don't know how it's pronounced. I don't know if it's "fee-doh." - Or F-I-D-O, as they would spell it out. - Fido. - Fido's more fun. - So, why FIDO security keys? - WebAuthn, or "web auth-n." - I feel like I did not know that. - It's spelled WebAuthn. - Awesome. - Yes. - Awesome. - Anything else you'd like me to say when I'm introducing you? - The practice runs have gone like 48 to 52 minutes.

I have a timer down here, so I can keep it on time. If I'm getting close, you can tell me. So you want me to stop at 50? Yeah, we'll do 50. Yeah, that sounds great. You've done this many times before, have you? I do lots of speaking. I'm not really an engineer. Actually, can I help? Sure. Sorry, don't mean to be rude. Just go straight down the middle. Okay. Hi, hi everybody. Are you going to wander around or are you going to stand behind here? Okay, cool. Actually, yeah, the recording rig's over here. The camera's just going to be on you. The camera's for where we are live-streaming everything. So the camera's going to see you,

and then we've got your graphics over there, and the laptop's over there. It's going to stream a picture-in-picture to YouTube. So we'll follow you around as much as we possibly can, but if you run, it's going to be hard. All right, I will try not to sprint. Actually, if you don't want to go, don't go. We'll try not to go. We got the revision. That sounds wonderful. We'll just hold these. And it goes on the side just like this. Yes.

- Yeah, and people get tired the first thoughts. - Software developer counts. I usually talk to software developers. Not as much. The conference organizers tend to care more than the attendees. Yes. Not as much, all that.

Yeah, we just want it to work. It's like, yeah, it'll work for you and other people. Just in case. Yeah. You get like the double-mic screeching scream of feedback. What cons have you presented at? A few. I've done a few DevOpsDays. My favorite con to present at was the Robotic Telescopes and Student Astronomy Education conference. Yes. I haven't done much on the academic circuit; I do most of my speaking on the industry circuit. Yeah, maybe I'll give it a pitch next time. It's been a long time since I went. Last time I went was like... Yeah.

I think they're having a security track this year. I'm going to have to check it out. But two years ago they were at Disneyland, right? They were in Orlando? It's in Orlando this year. So they've been switching between Houston and Orlando. Oh, it goes back and forth. Last year it was Houston; this year it's back in Orlando, so next year it might be in Houston. Yeah, the one I went to was in Baltimore, which is cool. Baltimore's fine. I did the Houston one. So it was Houston, Houston, Houston, Orlando. We've got a few Houstons. I have no idea. But yeah, you should definitely try to get in.

I was very sad my talk wasn't accepted. What was your talk this year? It was actually a diversity talk for the tech industry. I couldn't work on it; that's why I'm here. That's how it happens. Conference-driven development. Yeah. But I'm going to try again with a security talk. Those are really important. We should all submit. I'm actually really excited to see if there's a security track, because when I was submitting a talk, it was like, oh, here are the different tracks, and I thought, why don't we have a security track? Because, at least

when I went, unless they changed it last year, we didn't have a security track. So that happens a lot. It gets broken up. Yeah. Hi everyone. Good afternoon to the last talk in this room, and welcome to BSides. In case you didn't know, we are in Breaking Ground, 1234. I'm going to introduce our speaker, Jen Tong, and she will be giving a talk on why security keys and web authentication are awesome. Before we get started, I just want to give a few messages and thank our sponsors. We've got Critical Stack, Valley Mail, Security Agency, Silence, Secure Code Warrior, and many more, and without them we would not be able to make this event possible. Also, a quick note that this event

is being live-streamed, so please be mindful of your cell phones and turn them off or put them on silent so that they don't disturb anything. At the end of the talk, we're going to have a couple of minutes for Q&A; I'll be passing the mic around, so if you have a question at the end, just raise your hand. And with that, I will turn it over to Jen. Hi, everybody. How's it going? Good. How's your day been so far? Great. Yeah, mine's been awesome. I swear I haven't been cramming to polish the deck. But that's how conferences often are. So yeah, we're going to be talking about web authentication

and FIDO. But before I get into the actual content, I'm going to ask you some questions. Because I usually talk to a slightly different kind of audience. I spend most of my time talking to software developers. This is one of my first times talking at a security conference. I'm excited about that. So I need to learn a little more about you. How many of you know what FIDO and web authentication are? Like what these things are? Okay, about half. How many of you use web authentication or FIDO keys today? 20%. Hopefully I can convince you to raise that number up a little bit, but still better than when I go to most software developer type

stuff. How many of you in like an offensive security exercise, like red teaming or something, how many of you have taken over an account and used it to cause complete havoc? Oh, good, 30% in the room. Yeah, this is when I really started getting into web authentication, having had this experience and seeing how fun and horrible it can be at the same time. Thanks a lot for humoring me. So let me tell you a little bit about myself. Hi, I'm Jen, Jen Tong, as you heard just a moment ago. Professionally, I am a developer advocate at Google. My background is mostly in software engineering, a little bit in silicon process and robotics. But for the

last eight or so years, I've been doing developer advocacy, which means I go around to developers and help them make good decisions. In the past, a lot of this has been around making good Google Glass apps or using Firebase right. But around 2016, I started getting kind of scared of the state of security in the world and started turning my attention more toward security topics and the cloud. And it's been over those couple of years that I've had the opportunity to explore a lot of different stuff. But web authentication is one of the things that's really struck a chord with me. I think it's really cool, and I

hope I can share that with some of you. There's a couple of really big reasons off the top that I think make it awesome. One is that I can go to pretty much anyone, no matter what their role is, whether they just use Gmail or they're a software engineer or a CIO or something, and I can talk to them about this topic and give them some actionable advice they can do right now. You don't have to be a software developer to get a lot of value out of this concept. So I think it's pretty cool. Hopefully I can convince you of some of that. One more meta slide, a little bit about this presentation. As

I said, most of my speaking is to software developers, and when you talk to software developers about security topics, you've got to start a few steps back. I've adjusted it; I've cut a lot of the fluff from the deck to make it leaner, but it's still going to be a little different of a flow than you might have seen from other BSides talks. In other words, if it's not for you and you want to storm out and go find a different talk, I won't be offended. But I would love to talk with you about the topic later. The other thing is, this is half intended to teach you

about the tech and convince you that this is actually a useful technology, and the other half to convince you to hopefully use it yourself to protect some of your accounts or maybe deploy it at your organization. Okay, let's get into actual content. Let's talk about context. Let's talk about authentication and the state of the authentication world as it exists today. So despite all of our best efforts, authentication looks pretty much like it looked 40, 50 years ago. We have logins, usernames, maybe email addresses. We have passwords. We type them in, we remember them, kinda. And the reason it's been so hard to kill is because passwords are just so usable. Most interesting computers have some way to input information into them, otherwise they're not terribly

useful. There are some exceptions to that. So if you can put input into it to use it, you can probably put a password in. So it's very easy to implement. You just take the existing user input methods and you just use them for authentication instead. Yay, authentication done, move on to the next feature. But we've been plagued with problems the whole time. People have been choosing terrible passwords since the 1960s. They're still choosing terrible ones today. I could probably update this slide, but the data is still going to be the same. Nothing has changed. It's sad. And even way back in the 1970s, we realized that this was a problem. We were really worried about

people selecting bad, easily guessable passwords, and then some person in a hoodie coming along and guessing their password on the mainframe or whatever, and typing it in and causing those problems. And way back then we came up with techniques to try and help people come up with better passwords. We came up with like policies, like hey, use these things to make passwords that are safer and harder to guess. And me with my software developer hat on probably implemented a bunch of these things server side. Because you know, rules like this are really easy to write in code. This is only a few lines of code. And we hope that people will come up with high

entropy, wonderful passwords like this. But we know the reality is they came up with passwords like this. Hence the name of the track that this talk is part of. And even if we were fortunate and somehow through training we convinced someone to come up with a really nice, wonderfully, you know, difficult to guess password, we didn't see the other thing coming years and years ago, which was they remembered that password, but now they have like 700 accounts scattered across the internet instead of just like three accounts on a mainframe or whatever. So they take that super secure password and if it's good enough for their bank, it should be good enough for their email and

good enough for their employer. Which is exactly what gave rise to another threat. People realized this was happening, so rather than go after XYZ Bank, they would go after some fun new mobile game, or even just make a fake one altogether, just to harvest credentials. Either way, they get access to those credentials through some combination of breaches and somebody else's bad practices, and then our same old friend in the hoodie comes along and starts stuffing those credentials into a whole bunch of other services, and bad things happen. Ironically, if people had gone against some of our early advice and just written those passwords down on Post-it notes stuck to their computer

and used a different password for each service, they might be in a better place today. But I digress. But it's okay. Years and years ago, we came up with another way to help defend against guessing and also, coincidentally, against credential stuffing type attacks, which is one-time passwords. And they're really cool. Kind of cumbersome sometimes. The idea is you get a code from somewhere, either sent via some other previously decided upon trusted back channel like SMS or email, or you share a secret ahead of time and use that to generate these codes with hashing, either on a mobile app or with a dedicated hardware device. And they're great and all. They protect against the credential stuffing. We're in a much better place than we were before, but

phishing, where someone stands up a fake service and convinces someone to type in their OTP, is still a problem. And one of the things I encounter with software developer audiences is that a lot of people think phishing is a solved problem if they have OTP deployed, because those one-time passwords time out. They don't understand that the threats have moved ahead and that's not really true anymore at all. People aren't using five-year-old OTPs; they're not trying to. They'll just man-in-the-middle and do it in real time. So this is still a threat. And I think that the fundamental issue isn't really about user training, although user training does

help a lot. I think the fundamental issue is like a cognitive load because our users don't have some telepathic link with a third party service to transmit that secret over. They're burdened by technology. They have to interact with some kind of user interface. And then they have to, what we're telling them to do is to evaluate all of those interfaces, which vary greatly from one platform to another, from one device to another, and use a bunch of rules to make your best guess if this is the real site or if you've somehow been tricked into going to some random third party site. And although you tell people, and when they're at their highest mental capacity,

when you're talking to them during a talk or whatever, it's like, no, no, this can never happen. But it's still really effective. The data shows that it's very effective. And especially if you target and you find someone who hasn't had their coffee yet, and they're in the process of buying a house and have been trained to click on DocuSign links left and right, you can trick people into doing almost anything. So we need to find a way that reduces this cognitive load and allows us to have secure authentication. And that's what we're going to be spending the next 40 minutes or so talking about. And the solution is obviously FIDO and WebAuthn. What I'm going

to do now is go through it a layer at a time, starting from the most abstract, hand-wavy level, and then adding layers of complexity and reality. And the reason I take this approach is that if I have a mixed audience and people start to doze off in the middle, at least hopefully they got some value toward the beginning and have some understanding; at that point, they hopefully understand that something does exist that's better than a password and a one-time password. So to that end, we're going to talk about the solution in the abstract. Then we're going to concretify it

a little bit with FIDO and the WebAuthn protocols. And then we'll have enough to talk about using it to protect yourself and your accounts. And then finally, with the time left, we're going to talk about implementation on your own service so that your users can be protected, too. Yay. Okay. An abstract solution: how math can save us. Because math always solves our problems, right? It creates new ones for me. Despite having a math minor, I still find math painful. It is hard. But awesome. And it comes back to public-key cryptography. There are all these cool mathematical properties, and actually several different techniques we can use to generate private and public keys in pairs. Yay! And they have a neat property that if you use

them to either sign or encrypt something, it can only be verified or decrypted with the other key. A lot of people superficially think this is just about sharing secrets, but we can also use it for authentication. And that's where we can start to talk about an abstract scheme where we use public-key cryptography to authenticate users. For this, as with any other authentication scheme, we're going to have a couple of different phases, and you're going to grow tired of it by the end of this talk, because I'm going to break this down into the registration phase and the authentication phase over and over and over again. Registration is about setting up that initial identity and that initial trust, and then authentication

is about verifying that the same user who did the registration is the one who is present. So, let's do some sequence-y diagram-y stuff for registration. We have a user. Our user wants to use some kind of service. Probably something cool. It's probably a mobile game. They somehow generate a key pair, as if by magic. Maybe they do it by hand on a notebook. You could do that. It would probably take a while. We don't care right now, because we're still in the land of abstract ideas. They securely transmit one of those keys over, which they now declare the public key. Fortunately, we have wonderful technology for secure transport on the internet, transport layer security. And the FIDO and WebAuthn protocols rely on that existing

for some of the verification, which will come up later. But we can just assume we toss that key over in a secure way. The server stores that public key. Yay! Registration is done. It was easy, wasn't it? Now we can move on to authentication. So our user's already enrolled, essentially. The public key lives at the server. The private key is being kept secret by the user. And when they want to log in, The next time, the server goes, hey, we did a key exchange before. I'm going to generate a bunch of random characters and I'm going to send them over to you. I want you to sign them with the key that you have that you've kept private and then send them back to

me. So the user does that. The server can then verify that the signature is correct using their public key, that it matches the challenge. And ta-da! We've now successfully authenticated the user. So it's really that easy at the highest level, but of course the devil is in the details; there are a million little corner cases where things can go wrong. I just wanted to get the highest-level, most abstract part of the story out first. So let's take that super simple three-bubble diagram, let's concretify it, let's make it actually useful, let's make up some new words. And we're going to do that in, of course, the

software engineering way with protocols. Who doesn't love reading protocol documentation? I actually like it. It's probably why I'm talking about this right now. So, to kick it to the next level of concreteness away from abstraction, we're going to need a protocol that we can use to have a common way of doing this interchange that hopefully doesn't involve generating key pairs in a notebook by hand. And it's going to have something to do with hardware at some point because that's, you know, that is that intermediate part that I want people to stop evaluating with their eyes and little mental model heuristics. So, let's dive into the protocol. This is hopefully the place where I save you reading through the entire set of specifications, which is only

a couple hundred pages, but it's a little dense. It's better than Bluetooth, oh gosh. Once you start reading the specification, or even articles people have written about this topic, you're going to hit jargon really thick, really fast. So before we go through the protocol in more detail, let's do a little bit of glossary and define some terms. FIDO2 is the umbrella term; it is the container for all the other protocols around authentication. And it's a little bit hairy, because if you see FIDO without a number two, it could be referring to the standards body or it could be referring to the previous version

of the protocol depending on which specification document you read. But if you see FIDO and the number two in a contiguous word, it refers to the protocol. There was a little bit of renaming of stuff when FIDO 2 came out. You have the servers you authenticate with. These are called relying parties, or RP, all over the spec. So if you see RP even in the actual messages being passed, it refers to the relying party. We have different kinds of authenticators. We have platform authenticators, which means they are built into something else, a more general purpose computer. And we have removable security dongle style authenticators. They're also called roaming in a few places in the spec. Those authenticators communicate with the user's client, their web browser. over another protocol,

which is called CTAP2. If the two is missing, it's referring to the older version of the spec, CTAP1. The terms got rearranged here: the spec kind of rewrote history a little bit and decided that CTAP1 also means U2F. But if you read old docs, a lot of people use U2F to refer to the overall old protocol, which I guess would be FIDO1 now in the new terms. Anyway, there are two different wire protocols that the authenticators can use. They're both still secure; there hasn't been any huge flaw discovered in CTAP1. But CTAP2 came along with slightly better binary encoding, a little

bit easier to use, and we also got a few more bells and whistles in the feature set. But if you have a CTAP1 device, it's still going to work under the FIDO2 spec. And then finally, WebAuthn comes in. This defines the protocol used by the application running in the web browser or the mobile app, the way it communicates with the server, and some of the APIs on the server side. This is a W3C spec. I use this term a lot, because it's one of the only non-ambiguous terms in the entire set of protocols; everything else is a little bit confusing. So it's a good term to hook

onto. Okay, we have terms. Let's go through those same two ceremonies, or as I like to call them, dances, again: first registration, and then authentication. So now we have a slightly more concrete world. Our user has a little security key with them. And I'm going to assume that when they're registering, they're already authenticated: either this is happening because they just signed up for your service and you're taking them through this next step, or they're an existing authenticated user coming back to add a key long after the fact, which is the much more common case. And they want to add a key

to their server, the relying party. So the user tells the relying party, I want to register a key. Great. The server sends back a whole bunch of stuff, but there are a couple of things that are kind of key. It sends back a challenge, a cryptographically good, pretty long string of stuff. This is used both for the verification and as a potential source of entropy by the key, so it has got to be good randomness; I think it's up to 68 bytes long or something now. And also a relying party ID, which I think is just rpId in the actual wire protocol. And this should match the domain name

of whatever they're communicating with; that's used to help prevent phishing. Once the client gets that, it's going to re-encode it and send it over to the key, which does a little bit of verification: it verifies that the relying party ID matches the domain name that's visible, the origin, and it makes sure it supports all the features that have been requested. Then it generates the key pair, either using its own internal entropy source or using the entropy from the challenge. It sends a payload back up to the relying party, which includes the signed challenge, the public key that was generated on the authenticator, the origin that it saw so that

we can do domain verification yet again server side, and a credential ID that's been generated by the authenticator, which it expects the server to store. So the server is going to verify all that stuff, save the public key and the credential ID for later recall, and then make a counter and set it to zero, which will be useful later for a reason we'll get to in a sec. Okay, with this, registration is done. You can tell the user: hey, you're done, good to go, go have fun. Later, the user comes back and it's time to authenticate. Let's take a look at this one. The user probably does an existing method of login, although the spec has some other

cases. We're just going to talk about the boring case everyone's implementing now. They'll log in with their primary authentication, probably a login and a password. And then the server's going to be like, "Hey, you have registered some security keys. I want to verify for sure that you're who you say you are, so let's do the authentication dance." It's going to send over credential IDs, and it can actually send a whole list of them: you have one credential ID for every single authenticator that corresponds to that user, and you want to be able to store multiple authenticators per user. You also generate yet another challenge; you generate a lot of challenges server-side. Send it on

over. The user and their client are going to do another set of verifications. The client is going to use that credential ID (I dropped the ID from the slide) to find the private key that it wants to use to do the signing. It's going to increment its own internal counter, which is a little strange the first time you've authenticated, because you're just incrementing an arbitrary counter you haven't actually told to the server yet; but it doesn't need to know yet. And you're going to sign the challenge, sign the whole big old blob. So we're going to send tha
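The two dances walked through above can be condensed into a small server-side sketch. To be clear, this is a toy model, not the spec: the `RelyingParty` class and its method names are made up for illustration, verification of the authenticator's signature over the challenge is omitted entirely, and the origin check is heavily simplified. A real deployment would lean on a maintained WebAuthn library rather than hand-rolling this logic.

```python
import secrets


class RelyingParty:
    """Toy model of the relying-party (server) side of the two ceremonies:
    registration, then authentication. Signature verification is elided;
    a real server checks the authenticator's signature over the challenge
    using the stored public key."""

    def __init__(self, rp_id):
        self.rp_id = rp_id      # should match the site's domain name
        self.credentials = {}   # credential_id -> {"public_key": ..., "counter": int}

    def begin_registration(self):
        # The challenge must be fresh, unpredictable randomness: it is both
        # verified later and a potential entropy source for the key.
        return {"challenge": secrets.token_bytes(32), "rp_id": self.rp_id}

    def finish_registration(self, credential_id, public_key, origin):
        # Do the domain verification yet again, server side (simplified here
        # to a suffix check), then store the credential with its counter at 0.
        if not origin.endswith(self.rp_id):
            raise ValueError("origin does not match relying party ID")
        self.credentials[credential_id] = {"public_key": public_key, "counter": 0}

    def begin_authentication(self):
        # Send the whole list of credential IDs registered for this user,
        # plus yet another fresh challenge.
        return {
            "challenge": secrets.token_bytes(32),
            "credential_ids": list(self.credentials),
        }

    def finish_authentication(self, credential_id, reported_counter):
        # The authenticator increments its internal counter on every signing
        # operation. A counter that fails to move forward is the signal the
        # spec's signature counter exists to provide: a possible cloned key.
        cred = self.credentials[credential_id]
        if reported_counter <= cred["counter"]:
            raise ValueError("counter did not increase; possible cloned authenticator")
        cred["counter"] = reported_counter
```

A typical run of the sketch registers one credential, authenticates with a counter of 1, and would reject a replay that reports the same counter again.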