
I'd like to introduce Ashish Patel for Metabadger, automating IMDS. Welcome, I'll let you get started.

Okay, awesome. Thanks, everybody, for attending my talk. Today we'll be talking about automating IMDS protection at scale, as you mentioned, and a tool that we wrote called Metabadger. Really interesting name, definitely something that had a lot of thought put into it. So who am I? I'm a security engineer at Brex right now. I really like working on a lot of different things in the cloud security and infrastructure space. I think there are a lot of areas in this space that could use some automation, as well as overall insight into how you build tooling
and let robots do things for you to help the overall posture of that environment. I really think that shifting left, moving a lot of configuration into infrastructure as code, and pushing some of these default configurations into a secure state is super important. This is something I've been passionate about, and I've investigated it in different roles over the years. The tooling behind all this should really enable the software engineers you're working with and also provide guardrails. So how do we actually mitigate these things in a way that doesn't come with all this friction, you
know, against the different engineering teams, and do it in a way that's a little more seamless?

So why Metabadger, and why metadata in general? A lot of folks will ask this question. Really, the metadata service is what instances use to get all the context they need about what they're doing. The role that gets pulled onto the instance, and the credentials the instance is using, are what the instance uses to get context for the different things it'll access inside an AWS environment. So you might have an instance that has access to S3 buckets or to RDS or other things that
it needs to talk to, and so it'll talk to the IMDS service, the credentials get pulled down, and then the instance has the context to go ahead and do that. There have been a number of known SSRF vulnerabilities and exploits around this in the past. There were some pretty well-known attacks before 2019, and those have been prevalent since then. V2, the updated version of the metadata service, was released by AWS in 2019. Credential leaking from this attack vector is pretty bad, because whatever this instance has access to is what the attacker would then be able to exploit.
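For readers following along in text, the request shapes of the two metadata service versions mentioned here can be sketched roughly as follows. This is a minimal illustration that only builds the HTTP requests and sends nothing. The function names are hypothetical, and the link-local address, token path, and header names are taken from AWS's public IMDS documentation rather than from the talk.

```python
from urllib.request import Request

# Link-local address of the EC2 instance metadata service.
IMDS_BASE = "http://169.254.169.254/latest"
# Listing this path returns the role name; appending the name returns its credentials.
CREDS_PATH = "/meta-data/iam/security-credentials/"


def v1_request(path: str = CREDS_PATH) -> Request:
    """IMDSv1 style: a single unauthenticated GET, no token involved."""
    return Request(IMDS_BASE + path, method="GET")


def v2_token_request(ttl_seconds: int = 21600) -> Request:
    """IMDSv2 step 1: a PUT that asks for a session token with a TTL."""
    return Request(
        IMDS_BASE + "/api/token",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
        method="PUT",
    )


def v2_request(token: str, path: str = CREDS_PATH) -> Request:
    """IMDSv2 step 2: a GET that carries the session token in a header."""
    return Request(
        IMDS_BASE + path,
        headers={"X-aws-ec2-metadata-token": token},
        method="GET",
    )
```

On an actual EC2 instance, each of these could be passed to `urllib.request.urlopen`. Anywhere else the address will not resolve to a metadata service, so the sketch is best read as a description of the two request styles.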
This is a breakdown of the two versions of the instance metadata service. The first one, the original, is a very simple type of request: you're just making a request up to the metadata service, it gives you back a response, and that pipes your credentials, or whatever else you're asking the service for, back to you. There's only one method used there, just the GET method, on the instance: the request just goes up to the actual service, and it just gives you that back. So it's not doing any auth there. There's no token, there's nothing involved. And this is where a lot of
those attacks were able to leverage that simple request and response, to say: hey, let me just curl the metadata service, and that'll come back to you. V2 added a few things to make this a lot more difficult. We're moving from a simple request and response to a session-oriented request. So now there's a token in place, we're going to ask for that token when the request comes in, and we're also going to enforce a couple of other things around that. A lot of the time these SSRF attacks involve a proxy, and they'll have an X-Forwarded-For
header, which is saying: hey, we're going through another platform, essentially, to ask the instance to give us back those credentials. So v2 also includes other methods: you need to request a token from the metadata service, it passes that back, and then you're able to make the call with the token. So we're adding another layer of complexity, a bit more friction for the attacker. This is a quick diagram of how the attack would work with v1. We have a bad actor, and an SSRF vulnerability is present on the instance itself, on the front end,
and then there's a curl command that somebody may run once they get access to the instance. Essentially the attacker is tricking the instance into thinking: hey, this is a normal request that I can make off the bat, talk to the metadata service, and get creds for whatever the instance is meant to do. So these credentials are returned, the bad actor now has access to them, and whatever context this instance has is what's in play inside that AWS environment. So maybe the instance is doing something that's talking to S3
or somewhere else, but everything the instance profile and its attached role have access to is what can be reached from there. And then with v2, once we update to v2, this changes a lot. We're creating a lot more friction for the attacker. You can see here that the same attack flow would happen, but the attacker may not know that they need to pass a token now, so that's one more step you're adding to their attack, raising the complexity. And if they are able to go through and hit the metadata service, there's even more complexity, because this will
get returned to the instance and then back to the bad actor, but now we have all these other things in place that prevent the attack from taking place at all. The request gets denied without a proper token, there's a TTL attached to the session once you pull that token down, and there's also header enforcement around the proxy. So without all those things that are part of v2, and it's now a much more complex attack to pull off, you're preventing the attacker from using that, essentially from
outside of the AWS environment. This leads us back to why it's important and how we should go about solving this problem. The configuration for the instances themselves can live in infrastructure as code, so if you're using Terraform or CloudFormation or something like that, you can put that in as a configuration that says: when this instance comes alive, make sure it's using v2, and have that enforced. There's also the enforcement of session tokens, so we ensure that every request to the metadata service comes in with the token and is not just a
simple request like v1 would have had before. And then there's the proxy attack vector: making sure that header is not allowed, so we don't allow something from the outside to make that request on behalf of the instance. A lot of the time those attacks work that way, and this keeps track of how the request is being made in the first place.

Now we're coming to the question: hey, do we need a tool for this? Like every good security person, you'll probably google to see if a tool already exists. And what we did initially was
try to figure out: how do we get metrics around this, and how do we understand where the problem exists? You can do a couple of things today: you can look at CloudWatch metrics and the EC2 API to understand where your metadata service is being used and which version you're on. That exists out of the box in AWS today, so if you wanted to go into CloudWatch and pull those down, you could. We also really wanted to understand: can we make sure that every new instance that comes alive is using v2 off the bat? This can get really tricky once you're looking at accounts that have hundreds of
thousands of instances across multiple regions, and all these instances are doing different things. So these are all problem sets for the tool itself that we needed to think through. From the preventative perspective, you could put in a control policy: AWS has a concept of service control policies, where you can just enforce certain things off the bat. So this will update the IAM policy and say: make sure every request coming through is using v2, and if not, don't let it pass. This can break things pretty easily, because some
things just don't support v2 out of the box, so you have to be very careful when you're doing that. And the last note there is around enforcing this across every single instance. You might have a lot of legacy infrastructure in your environment that only supports v1 and hasn't been updated, so you have to be careful about this. So we needed a tool to solve all these problems, and that's what we started baking into Metabadger itself. This is a simple discovery process for how you'd go about solving this
at scale and in general. You look at some of the existing metrics that you have, and even logging like CloudTrail, to see: which version of the metadata service am I using, and how am I using it? Is user data being passed in? Is there a role being used? Are you running any software that works with one version or the other? There's a known list of vendors out there that only support one version, or maybe haven't updated to v2 yet, because it's actually a change in the code for how they talk to the metadata service.
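The discovery step described here can be sketched as a small summary over the EC2 API's `describe_instances` output. This is not Metabadger's own code. It is a minimal sketch that assumes a boto3 EC2 client for a single region, and the function names and category labels (`v1_allowed`, `v2_only`, `disabled`) are hypothetical. An instance with no `MetadataOptions` in the response is counted as allowing v1, which is an assumption of this sketch.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Iterator


def iter_instances(ec2_client: Any) -> Iterator[Dict[str, Any]]:
    """Yield instance dicts from a boto3 EC2 client (one region)."""
    paginator = ec2_client.get_paginator("describe_instances")
    for page in paginator.paginate():
        for reservation in page["Reservations"]:
            yield from reservation["Instances"]


def classify(instance: Dict[str, Any]) -> str:
    """Label an instance by its metadata options."""
    opts = instance.get("MetadataOptions", {})
    if opts.get("HttpEndpoint") == "disabled":
        return "disabled"      # metadata service switched off
    if opts.get("HttpTokens") == "required":
        return "v2_only"       # session token enforced
    return "v1_allowed"        # tokens optional, so v1 requests still work


def summarize(instances: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Count instances per category and how many carry an instance profile."""
    counts: Counter = Counter()
    with_profile = 0
    for inst in instances:
        counts[classify(inst)] += 1
        if "IamInstanceProfile" in inst:
            with_profile += 1
    total = sum(counts.values())
    return {
        "total": total,
        "v1_allowed": counts["v1_allowed"],
        "v2_only": counts["v2_only"],
        "disabled": counts["disabled"],
        "with_instance_profile": with_profile,
        "percent_v2_only": 100.0 * counts["v2_only"] / total if total else 0.0,
    }
```

With real credentials this would be called as `summarize(iter_instances(boto3.client("ec2")))`, once per region. Instances that carry an instance profile are the ones most likely to depend on the metadata service for credentials.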
And the real test after that is, once you do make the changes, you should test them out in lower environments. If you have dev and stage, make sure that everything's working as expected, because you might have calls that fail, you might have applications that start breaking. And then validate the logging: once you get all that done, you can look at logs to say: is the metadata service still functioning as expected? Are these instances still doing what they're supposed to do?

This is the tool itself. The tool solves a number of different things. We really wanted to understand, hey, like,
which version of IMDS are we on, and be able to pull those metrics down; check the role attachments, because if you're using an IAM instance profile with a role, you're probably using the metadata service to a degree; and then have a number of different abilities around the metadata service itself: be able to enable it, disable it, check it, and even roll back in case we break something and want to fix it. Building all this into the tool was really helpful, and so was being granular about certain things. Let's say you're in your
environment and you know for a fact that certain instances can't be updated: you can specify a tag for where you want to apply this, or you can even specify a CSV of instance IDs where you want to apply this. That gives folks the ability to do it in a very granular manner, so you're not going to break things off the bat, and you have the confidence that you can go in and fix the problem without running into larger issues, because that can get really tricky. This is the flow of how you'd use Metabadger all the way through,
combining all the steps. You would do the whole discovery process: understanding your usage, seeing what roles you use to get some context on that, and figuring out where you're not using the IMDS service, because if you're not using it, it's probably best to just remove the attack surface altogether. This is just a vector that somebody could use, so if you're not using it, turn it off. If you want to update, you can also do a dry run of the update to v2 and then validate that all those workloads are working correctly. On top of that, the tool will
go through and get all the coverage metrics, so you can do an analysis: what impact did we make here, how many more instances are now using v2 versus v1? And then you continuously audit this. This is a quick demo of how the tool works. We run metabadger discover-metadata, which goes through and gives you a breakdown of how many instances are on v1 versus v2, so we have zero percent there. We're doing some discovery of roles, to see whether some of these instances are using roles, and then going through and updating all of these. The tool will just go through
and scroll, hit the EC2 API, and update those instances to say: only use v2. And that's it. After that you can run the discover-metadata command again and it'll give you the metric on that. Just a quick reference here: this is where the code lives, and that's my Twitter handle. I also wanted to give a quick shout-out to Kinnaird. Kinnaird wrote Cloudsplaining and a couple of other cloud security tools as well, super neat, and helped with a lot of these problems that I was thinking through along the way. So
I definitely encourage folks to use the tool, and if you have any questions, please reach out to me. I'm excited to see this come to life. I've seen so many folks download it over the last year or so, and I think it's a problem that exists across the board. But yeah, thanks for your time. Are there any questions? [Applause]
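The update, rollback, and disable operations described in the talk all map onto one EC2 API call, `modify_instance_metadata_options`. The following is a minimal sketch of that idea and not Metabadger's implementation. The `MODES` table, the `apply_mode` name, and the local `dry_run` flag (which only builds the plan and never calls AWS) are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List

# Metadata option settings for the operations mentioned in the talk.
MODES: Dict[str, Dict[str, str]] = {
    "harden": {"HttpTokens": "required", "HttpEndpoint": "enabled"},    # v2 only
    "rollback": {"HttpTokens": "optional", "HttpEndpoint": "enabled"},  # allow v1 again
    "disable": {"HttpEndpoint": "disabled"},                            # turn IMDS off
}


def apply_mode(
    ec2_client: Any,
    instance_ids: Iterable[str],
    mode: str = "harden",
    dry_run: bool = True,
) -> List[Dict[str, str]]:
    """Return the planned changes; send them to EC2 only when dry_run is False."""
    settings = MODES[mode]
    plan: List[Dict[str, str]] = []
    for instance_id in instance_ids:
        change = {"InstanceId": instance_id, **settings}
        plan.append(change)
        if not dry_run:
            # boto3 EC2 client call that changes the instance's metadata options.
            ec2_client.modify_instance_metadata_options(**change)
    return plan
```

Keeping the instance list explicit mirrors the granular approach from the talk: the IDs could come from a tag filter or a CSV file, and the default dry run lets you review the plan before anything changes.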
Yeah, so going back to the SCP thing: you could probably set that at the top level and enforce it off the bat, but if you have stuff that's running on v1, you'd have to go in there and manually make the change. Cool, anybody else?
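As a rough illustration of the service control policy mentioned in this answer, the sketch below builds a policy document as a Python dictionary. It was not shown in the talk. The `ec2:MetadataHttpTokens` and `ec2:RoleDelivery` condition keys come from AWS's documentation, while the function name and statement IDs are hypothetical. As the answer warns, the second statement would break any workload still using v1 credentials, so it is a sketch to adapt and not something to apply as-is.

```python
import json
from typing import Any, Dict


def imdsv2_scp() -> Dict[str, Any]:
    """Build an example SCP document that pushes accounts toward IMDSv2."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Block launching new instances unless tokens are required (v2 only).
                "Sid": "RequireImdsV2AtLaunch",
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringNotEquals": {"ec2:MetadataHttpTokens": "required"}
                },
            },
            {
                # Reject API calls made with role credentials obtained through v1.
                "Sid": "DenyCallsWithImdsV1Credentials",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {"NumericLessThan": {"ec2:RoleDelivery": "2.0"}},
            },
        ],
    }


if __name__ == "__main__":
    # Print the JSON that would be attached at the organization root or an OU.
    print(json.dumps(imdsv2_scp(), indent=2))
```

The first statement only affects new launches, which makes it the safer one to start with. Existing instances running on v1 would still need to be changed one by one, as described in the answer.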
All right, awesome. Thanks, thanks again. [Applause]