
Your Secrets are Showing! How to Find if Your Developers are Leaking Secrets?

BSidesSF · 2018 · 24:16 · 299 views · Published 2018-04
Category: Technical
Team: Red
Style: Talk
About this talk
Developers routinely leave secrets scattered across systems—SSH keys, AWS credentials, API tokens, and configuration files—that attackers can exploit after gaining initial access. This talk maps the post-exploitation reconnaissance playbook: harvesting credentials from shell history, SSH configs, and git repositories; pivoting through discovered hosts; and establishing persistence. Practical defenses include passphrases on keys, secrets management tools, and developer training.
Original YouTube description
Ian Lee - Your Secrets are Showing! - How to Find if Your Developers are Leaking Secrets? This talk will zoom in on the cache of goodies which developers leave lying around that an attacker could leverage to access valuable information and / or to pivot through a target environment. It will also highlight some of the tools available to developers and InfoSec professionals to find and prevent these sorts of information leakages. Every day, developers interact with a variety of source code repositories and environments, often both inside their corporate firewalls and outside on public hosting platforms such as GitHub.com and Amazon AWS. These source code repositories can provide a wealth of information about a target environment, in addition to being of potential value all on their own. Are you able to find this information in your environment? Do you know how to help your developers prevent these leakages in the first place? Remember "prevention is ideal, but detection is a must!" Prepared by LLNL under Contract DE-AC52-07NA27344.
Transcript [en]

[Music]

thank you very much so my name is Ian Lee I work over at Lawrence Livermore National Lab out in the East Bay here in the Bay Area I'm a computer engineer there I work in our high-performance computing center which houses some of the largest supercomputers in the world this is a picture of Sierra which is the system we're in the process of siting now which is estimated to be about a 120 to 125 petaflop computer by the time it's installed later this year I'm also an open-source evangelist and run some of the open source work that Lawrence Livermore does internally and in collaboration with the Department of Energy and across the US government as a whole so recently in the news there

have been a lot of data breaches there have been a lot of exposures of secrets recently there was Panera Bread there have been other vendors that have recently been hit and have had some secrets had some leaks had some information get out the door and so how do these things happen well back during world war ii there was this comic you know don't discuss your secrets on the telephone perhaps now it's time for an update we could say don't discuss your secrets on the internet instead and the reality is though that we're actually putting more and more of our secrets out there so with the prevalence of code hosting tools like github gitlab bitbucket org

the bucket services in general as well as other cloud hosting services like Amazon Web Services Google cloud and Microsoft Azure we're seeing more and more source code and configuration move out of the internal environment and into external environments and sometimes this can lead to problems so recently this is just one particular example where AWS secret keys were leaked as part of a data compromise so some endpoint was not being properly protected and it allowed access to view the actual source code of some internal application where the secret key was embedded and there was a talk earlier today talking about how to find some of these in your code in general and some of the issues

with managing keys in mobile applications so if you just take a look at for instance the AWS secret access key and you throw it into github you'll actually find almost 400,000 instances where this shows up in source code that's out there on the internet today now a lot of times it's benign it's some sort of placeholder text there is no actual value there but that's not always the case and if you look at the version control history of some of these tools you don't have to go very many pages through the search results to find things that are actually the access key and secret access key for real production and staging servers on github now one of the issues that we'll

talk about in more detail is that git being a distributed version control system being something where you're pushing out into the cloud even when you try and clean up these issues it doesn't always work quite the way that you expected so for instance this particular commit that had this particular file in it had been deleted it wasn't available on any of the public branches but it is still being indexed and is still available on github.com because the commit itself hasn't actually been deleted only the reference to the commit so instead of hard-coding these secrets into your source code into developer environments what happens well we end up moving a lot of these access keys and access tokens

into our environments so these are either shell environments or some other environment that we might get through vaults or some other system and it ends up being that we have code that goes out into the environment looks and reads the key kind of dynamically at runtime and this is great this is much better than putting the secrets hard-coded into your applications but how does this play out during a penetration test so my role at Lawrence Livermore is to actually conduct penetration tests against our HPC systems and one of the things that is unique about us is we actually invite users in to come and run on our systems and a lot of the techniques

that are developed for single you know developers running applications on their laptop or developing an application on their laptop with things like IPython notebooks or Jupyter notebooks and other services like this don't work so well when there is potentially a hundred other users running on the same system that you are so we actually provide this as a service to our customers so the concerns are a little bit different so in a penetration test what will happen well one of the kind of endpoints or close to the end point of a penetration test is in the post exploitation where you've gotten some shell on the system again this is often the starting point for our penetration tests where if we

got on the system what would we be able to find that we wouldn't want unauthorized users to be able to access so if you got a shell on a system you might reach out and try and find out some information about who you are in that system in this case I'm just using a vagrant virtual environment that I created for this talk so you might find some more information about your environment you might look at the network configuration you might try and see what kind of privileges you have so in this case looking at do I have any sort of root access and in this case you actually do this user has root privilege without use of the password which is

pretty bad but that's a vagrant default behavior and at this point we can declare victory right we're done we have root on a system we can write our report we can move on well if we really want to try and provide more value for either our customers or users that might be using some platform where we invite them in like this we actually want to dig a little bit deeper so let's go back to where our users were setting up these tokens to read things out of the environment so our environment is a very Linux heavy environment and we actually did a lot of the early work on Linux systems a lot of

the HPC systems that we run were the first ones using Linux as the base operating system back in the 90s so we actually have a lot of experience with Linux as our operating system and one of the ways that you might see this come into your environment is through the way of setting environment variables so here you can see the AWS access key and maybe a secret access key if you're on the host you can read out your environment with the export command in a bash shell and grep around for some of the tokens so you might have some in particular that you're looking for that you've found through either code or some other

means that you want to try and find more information about or more specific keys but how do the environment variables actually get there in the first place well they can actually get there a number of different ways one of the ways that they can get there is through the use of shell profiles or environment profiles where you put the tokens or access keys or whatever into your environment profile and this gets read every time you log in so here if I was a penetration tester if I landed if I got exploitation of some developer system I might be able to read out not just the AWS keys that I was originally sort of privy to and trying to find more

information about but here I can actually find gitlab and github API tokens as well another thing that's interesting if you're someone who's compromised a developer system is you can look at the history of what commands that developer has run if this is someone who uses a shell often you'll see a lot of history and you might be able to run some analytics to see what sort of repositories they're talking to what sort of commands they're used to running what sort of hosts they're going out to and a lot of times we'll get on a system as a penetration tester and then we'll try and see what other systems we have access to and this is actually a nice

passive way to find extra systems that may be in scope for a test or some other exercise so here you can see there's a couple of git repositories that the user has made some effort to maybe they're a contributor for these projects we can look and see what they have in these projects we also see some pretty obvious glaring bad things like here they're using MySQL and they're actually well there's a couple of bad things here one they're using the password directly on the command line which means that it's now saved in the shell's history file so in this case root is the password and the fact that it's sort of root as the default login credentials is

another bad thing that we want to try and catch from our developers a bit we might also find other information maybe this is a developer that actually does backups of the system and so we can see things where they're actually dumping out the database information to a backup well again this could give us more information without actually having to move actively through the environment we don't actually have to connect to a database host in order to find more information about what's in that database in this case this is probably just a plain SQL file that we could go read without any encryption at all this is all well and good but once we have some of this information we want to see

what else we could find maybe we want to move through the environment but we want to disguise ourselves to look like what the developer is doing so in a lot of Linux environments a common way to move through them is over SSH the secure shell and the idea here is that you can connect from your host so let's say my laptop onto some other system and you can execute commands there and do other operations there so if I was on the developer system the first place I would look for secure shell information is in the home .ssh directory and there's a couple of things here that you might find that you could leverage pretty

quickly and learn a lot about the environment again without having to move anywhere off of this one particular box so one thing that immediately jumps out is here there's a trio of SSH keys or what are probably SSH keys and you can see the private keys and the public keys with the .pub at the end of them and additionally we find the configuration file so this is a file where developers might put extra information about their environment hosts that they connect to often and other things that they might put in as ease of access to some of their hosts so if we dive into that configuration file there's a couple of things this is a

fairly simple looking ssh configuration file and one of the things you can see is there's a host AWS so here they are defining an alias for an AWS host and whenever they type ssh AWS they'll end up going to that host as ec2-user at a particular host name this could be some internal IP address or some other host in this case in the Amazon ec2 cloud and in this case you can see they're using that AWS public private key pair that we saw in the previous slide so another thing we can learn is that there's a couple of hosts DB and web which are pretty self-explanatory what those might be and

here the developer actually logs in as the root user using a particular production SSH key so maybe this is a particularly strong SSH key but they're logging in as that user doing operations there on that host and for everything else it's going to fall under this host star section so any other host that's not one of the three listed above DB web or AWS is going to use this set of configurations and here you can see two things that they're persisting any SSH connections they make so if you connect to that host again it'll actually reuse the connection from an earlier connection if they have successfully authenticated and by default they're going to present that

identity file the id_rsa which is the default RSA SSH key file so coming back to those keys a little bit if we look at the files we can actually learn more about the configuration the strength of those SSH keys in this case we can see that the AWS secret key is encrypted so there's some sort of passphrase that was put on this SSH key it could be weak it could be strong ideally we want our developers to use strong passphrases on their SSH keys but worse than that we actually see that the default SSH key doesn't have any passphrase encryption at all so here it's just the actual private key sitting on the host with the

public key if we take those two keys away onto our own attacker's box then we could actually leverage them in future explorations through the environment so another file that's kind of interesting in this SSH configuration area is the known hosts file and it's a little hard to parse here here is just three lines of this file but a couple of highlights and some more information gathering that we can get from it are any of the hosts that we've previously connected to so here we can see we've connected to the DB and the web hosts showing up here and then these are the fingerprints of those actual hosts so these are the public keys for the hosts

that we've connected to and we might find others again inside of that environment where this developer has gone out and made connections to their hosts in the environment that we might want to go and see if we can gain access to perhaps using that unencrypted SSH key on the previous slide so another host here that shows up not surprisingly from the bash history earlier is the github.com URL so here's actually the host that's responding whenever we do a git clone operation or a git push or pull from github.com and this might allow us again in combination with the SSH keys that we see to actually go out and look and see what else that developer has access to on

github could be public repositories or private repositories that are able to be accessed with this SSH configuration so once you're able to access some of these hosts again github.com being one of them maybe some internal servers what do you want to do well you might want to try and actually find source code and dig deeper into some of these configuration issues that you saw before so one example you might look just through the developer's home directory and see what other git repositories are here we saw several clues that make us think that this developer uses git regularly so we can actually look through for any of the git repositories that they're using in the system and we'll see that there's a

couple here the cpython and the Django repositories that we would expect from our earlier look at the history file but we can also see there's this other special project directory that we didn't see in the history before and there's a number of reasons why that could be but if we actually look into it perhaps we find some more information maybe we find some source code that's either valuable in and of its own self maybe we're trying to actually prove that we can get to some development source code as the purpose of our penetration test and if we actually look at that source code we will actually find some more secrets or some more information about

hosts that it's going out connecting to that we might be able to leverage further in our attacking again if we look at this configuration we didn't see it as part of the bash history but perhaps we could actually look at the git configuration for that repository sitting on the file system and find more information again about what servers this developer has access to and where they might put their source code so in this case the repository in question this special project git repository is actually hosted on in this case an internal gitlab example.com server so maybe this is inside of the actual intranet rather than out on the public internet on

github.com so that gives us our SSH we've gotten some source code now but one of the main points of exploitation and post exploitation is actually developing persistence so here this isn't actually just something that attackers want to do in fact we actually have developers that want persistence into various environments as well and one of the ways that we see developers do this is through a built-in command or a fairly low-level fairly available command called screen so screen allows you to start a shell session that will persist after you disconnect from the host so imagine I'm a developer I'm working on something from the road I want to be able to get into a server and then when I have to

get off the train I want to be able to close my laptop and move on but I want some long-running process on some remote server to continue running in the background continue running on that host so I might start a screen session here perhaps we might find one if we're listing out screen sessions that are already available we might find one in this case labeled DB maint so that maybe gives us some hints as to what that screen session might be about and we can try and reconnect to that using the -r option and we will end up in some sort of screen session now typically this is the screen running

on the host where you're executing the command however it's possible that what you find inside of this is actually a connection out to some other host so potentially if I'm a developer working on my laptop I need to go to some gateway system but the code I really wanted to run was on some deeper internal server so here I've opened up a screen session on the gateway server and then I'm actually connecting from that deeper into the network here into the database server and you can see again going back to that SSH config we're actually landing on this system as the root user so there are several bad things here that we can

point out pretty quickly but we can also learn some more about what this developer has been doing on this system again seeing what commands they've actually been executing here we'll actually see some history of whatever they've been doing in this case seeing not just the command but also the warnings that they got about how they shouldn't have put their password on the command line in the first place so we can get out of this screen session and move on and see what else is in this environment that we could take advantage of but before we leave maybe we want to start our own screen session and make some sort of persistent connection out to our own command and control server so

here we can actually leverage the developer's tools themselves to create some persistent connection out to a system that we control as the attacker doing the penetration test and we can disconnect from that screen session and unless the developer actually looks at that system they may not notice that that session is actually still sitting there still running in the background but if you do the listing of the screens you'll actually see that it is still sitting there and able to be reconnected to in the future again running our reverse connection out in the back so just as a recap we have here several user environment features that we can actually leverage as part of a

penetration test we have things like shell environments profiles and the history of those commands to actually see what's going on inside of a developer system this will allow us to passively gain more information about that environment and what's important to know here is that it may not just be that one developer but similar developers inside of the organization may have similar approaches to doing things and so as we get from one developer machine to another or between different servers we can actually learn more about what's going on in that environment SSH configurations here we can find some key pairs that we might be able to leverage there's other things that we could dive deeper into like the

persistent connections and how we could actually reuse any of those persistent connections inside of the environment to move laterally once we've gained access to this one host but we can also use it just to find other hosts that might be of interest if we have these other systems that we know are or we suspect at least are doing some sort of database operations or web server operations those might be our targets depending on the scope of our penetration test obviously developers' source code is very important we can find some internal repositories external repositories and you might be surprised at what sort of code you find in those histories again when we saw the

example earlier where a commit got pushed to github but not properly cleaned up we can see the same thing sitting on a developer system if some secret got committed into the git repository let's say the developer caught it in advance of pushing it outside of the corporate firewall onto some external server they might not have done the same level of rigor to clean it up on their internal server so if you are on that developer's system you might actually be able to look through other parts of the git history to see and find other credentials and passwords and information in these dangling commits that are not being referenced anymore and again persistence is always important and it's

not just something that attackers worry about developers actually have come up with these tools that we can reuse and live off the land to incorporate into our penetration tests so I don't want to cast a bunch of gloom on developers I myself came from a development background before moving into the cybersecurity side of things and I use these techniques myself in my day-to-day operations but it's important to know what the limits of them are so for instance using SSH keys without any sort of passphrase is just generally not a great idea it solves the I don't want to have to type my password problem but again that's really working around your issue a lot of the code hosting tools

will actually allow you to disable using SSH keys if you wanted to if you make that decision but they can provide value if you do decide to leave it on it's important to train your developers and show them how to use passphrases how to do things more securely for instance it may be that you want to advise your developers not to put key pairs everywhere on every system that they're using but instead manage them on one system where they have tighter control over what those keys have access to and how they're actually used and made available to other systems so knowing what's in your history is obviously an important thing one of the

takeaways that we had recently was actually to start monitoring our service account histories so if you have some service account that's running let's say a web server you might want to actually monitor what that service account is doing as part of its operations if it's getting a lot of shell history if it's getting a lot of command execution that might be a sign that it's not just the normal maintenance that you're used to experiencing or that your developers are doing but instead that someone has gotten on to that system is taking advantage of some vulnerability in some application and is executing commands this is a way to find what's actually being done
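[Editor's sketch, not from the talk] The history review described above, both reading a developer's shell history for inline credentials and watching service accounts for unexpected shell activity, could be approximated in Python roughly as follows. The regular expressions, the function names, and the `expected` threshold are illustrative assumptions, not tools or rules the speaker names.

```python
import re

# Hypothetical patterns chosen for illustration; tune them for your environment.
SUSPICIOUS_PATTERNS = [
    # Password passed inline to the mysql client, e.g. "mysql -u root -proot".
    re.compile(r"mysql\b.*\s-p\S+"),
    # Assignments such as GITHUB_TOKEN=... or password = ...
    re.compile(r"(?i)(password|passwd|token|secret)\w*\s*=\s*\S+"),
    # Strings shaped like an AWS access key ID.
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
]


def flag_history_lines(lines):
    """Return (line_number, line) pairs that match any suspicious pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in SUSPICIOUS_PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits


def accounts_with_activity(histories, expected=0):
    """Given {account: [history lines]}, return the sorted account names
    whose history is longer than the expected baseline."""
    return sorted(
        name for name, lines in histories.items() if len(lines) > expected
    )
```

A caller would read each account's history file (for example `~/.bash_history`) into a list of lines and pass it in. A service account that normally has no interactive history but suddenly shows entries is the kind of signal the talk suggests watching for.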

and I'll close out here by saying you know there's many good static code analysis tools that go along with this so if you're looking at your source code for your corporate internal repositories it's good to incorporate tools here I put out just a couple of links OWASP has a list of tools by language for what to use myself personally I'm mainly a Python developer so I put bandit here as one of the tools that I incorporate in my workflow but I think one of the takeaways that I want to emphasize is that just having static code analysis isn't enough in and of itself and instead you actually want to have

version control aware code analysis so there's a couple of tools one's out by 18F which is a sub agency of the General Services Administration in DC and AWS Labs both have tools that will go through your git history looking at all the commits in your repositories for these same sorts of issues so it's not just I have looked for the use of password equals thing in my continuous integration tests but actually go through the history this is particularly important if you're moving from having code hosted internally to code hosted externally before you push your code out there you want to make sure that everything in the history is also good and it's not just good today sometimes

you can find secrets like this in your source code depending on what kinds of things they are you might actually decide that instead of actually pushing an entire git repository out from inside our firewall to outside our firewall what we instead want to do is actually trim off just the top commits make a new git repository that's brand new that we know we've checked and there's nothing in the history so there's a couple of concerns to trade off there and at that point I'll take any questions [Applause]
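[Editor's sketch, not from the talk] The idea of version-control-aware analysis, scanning every commit in the history rather than only the current checkout and including dangling commits that no branch references, might look roughly like this in Python. This is a minimal illustration under stated assumptions, not the 18F or AWS Labs tooling mentioned above: the two patterns and all function names are hypothetical, and it assumes the `git` command-line client is installed.

```python
import re
import subprocess

# Illustrative patterns only: an AWS access key ID shape and "password = ...".
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
PASSWORD_ASSIGN = re.compile(r"(?i)password\s*=\s*\S+")


def find_secrets(text):
    """Return (line_number, line) pairs from text that look like secrets.
    Line numbers refer to the text passed in (e.g. git output), not to files."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if AWS_KEY_ID.search(line) or PASSWORD_ASSIGN.search(line):
            findings.append((number, line.strip()))
    return findings


def run_git(repo, *args):
    """Run a git command inside repo and return its standard output as text."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    )
    return result.stdout


def unreachable_commits(repo):
    """List commit hashes that no branch, tag, or reflog entry references."""
    output = run_git(repo, "fsck", "--unreachable", "--no-reflogs")
    commits = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[:2] == ["unreachable", "commit"]:
            commits.append(parts[2])
    return commits


def scan_repo(repo):
    """Scan patches of all referenced commits, then any unreachable commits."""
    findings = find_secrets(run_git(repo, "log", "--all", "-p"))
    for sha in unreachable_commits(repo):
        findings.extend(find_secrets(run_git(repo, "show", sha)))
    return findings
```

Calling `scan_repo("/path/to/checkout")` before publishing a repository would surface matches anywhere in its history. A clean result from a pattern list this small is weak evidence at best, which is part of why the talk also raises publishing only a fresh repository built from the latest commits.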