
Hello everyone. I'm sorry I can't be there in person in Newcastle today. I really wanted to be; I haven't been to Newcastle in many years. Don't let the accent confuse you: I am actually a Yorkshire boy through and through. I've spent my life in other places as well, but I generally come back to the North, although I'm in London right now, so we'll see how this goes. All right. Today I'm going to talk through a day I spent at an old company a couple of years ago, where we looked at what red teaming looks like in a cloud environment: what can you use a red team for, and how can you use that to really drive
culture? Now, the company I'm at now and the company I was at back then were both consultancies. Where this came in for us was that we go out and work with businesses all over, we're always trying to improve things where we are, and at the end of the day we can't be a liability going into clients. So how can we make it so that everyone in delivery at the consultancy is able to look at things with a security mindset, with the right lens, with the right view on the world? A lot of them were application developers, and we needed all of them to up their security game as we went along. So how
can you do that? How can you make change happen in a business like that? I'm going to start off with the one thing I'd like you to remember from today, because I think it's the most fun part. Hopefully you remember the other stuff as well. There's this amazing word in Japanese, tsujigiri. After giving this talk a few times, no one has ever known what it means straight away. Obviously I can't be there in person, so I can't ask if anyone knows it today. What it means is trying out a new samurai sword on a random passerby. Truly an amazing word, and
really this was kind of the ethos that we took into the day. How this all links together will become more apparent as we go through. Unfortunately I have to do an obligatory "about me" slide, for which I apologize; I'll run through it very quickly. Who am I? I'm Josh Armitage. What do I do? I'm the AWS practice lead at Contino here in London. And why should you listen to me about security? The best answer I've got for that is that I spent 13 years in a prison colony. You may know it as Australia, but I still think that counts. Also, I am now working heavily towards being an O'Reilly author as well.
So I'm writing a Cloud Native Security Cookbook. There's an early release out on O'Reilly Learning as of last night, as it happens, so this is very well timed. If you think at the end of this that what I say is interesting or informative, you can go and check out the book, and you can find me on all the socials, all that kind of good stuff. In security, a lot of the time we talk about "shift left": you shift security to the left, you put it earlier in your CI/CD pipeline. I kind of warped that concept a little bit with this, where
I'm talking about shift-left learning: how can you bring learning forward? I saw a quote the other day, I think from Kelsey Hightower, saying there's no compression algorithm for experience. When it comes to security, how can you give people the experience they need in a safe way? You don't want everyone to have to learn everything when the security issue goes off at three o'clock in the morning and everyone gets pulled into the SOC. Instead, how can you bring this forward so that they learn now? I think everyone's seen this graph, or similar ones, where cost grows exponentially over time: the later you leave it to get this
learning in and understand how your security posture works and what you should do in particular scenarios, the more expensive it's going to be. You don't want people learning everything about your security process when a real security incident happens. That's the worst time to learn, because that's the time of highest risk and highest cost. So how can you bring that forward? We're going to step through a few mental models or approaches that I think help frame everything else. The first one is from the US Secretary of Defense, I want to say, if I've got the title right, Donald Rumsfeld, where he split knowledge into three different buckets:
you've got your known knowns, you've got your known unknowns, and you've got your unknown unknowns. We'll step through those quickly now. The known knowns are the things you know you know. It seems almost redundant on the face of it, but there are things you know you know. In this kind of scenario, security is everyone's responsibility. Everyone should know that; it's a known for the entire company. It's not the security function's job to do security for everyone. One, it doesn't scale, and two, I think we all know that generally the weakest link in any business's security is the people, unfortunately. It really needs to be this ethos of
everyone's responsibility, and at the consultancy I was at at the time this was the approach that we took: everyone's responsible for security. It's not just one person's job or one team's job. Nearly every business has an incident response process. I always get curious thinking about how long it's been since someone read it. Maybe they read it on onboarding. Do they ever look at it again? Maybe, if they've got mandated training every 12 months. But normally all they've ever done is read it off a page in Confluence or SharePoint or wherever your documents are stored. There normally is one, and people maybe read it once, or at least they check a box saying
they've read it. This next bit is kind of interesting, because this was done pre-pandemic, but this was a consultancy, so by nature we were a distributed team across the city we were operating in at the time. People were everywhere. For the red teaming event that we ran, we didn't want to add the complication of having to communicate asynchronously over Slack and everything else; we wanted people to be together. We didn't want to test whether our communication pathways were sufficient right now; we wanted to test other things. If you take the scientific method, you want to control variables and only change one thing at once, not add a whole
heap of different variables into the equation, because then you can't figure out what happened and what the problems are, and you can't really move forward. So in this case we called it a company day and pulled everyone on site together. Again, this was pre-pandemic, but that was something we decided on to make things easier for the surprise blue team. Another thing we thought we knew was the expected avenues. If we started messing with the AWS environment, we thought we knew how people would react, where they'd go, and what they'd try to do in the first instance. We thought we knew that, and when I get to the end of the talk
we'll cover off whether we were right about that or not. OK, moving on to known unknowns. These are the things we know we don't know. What were the things we were trying to learn? What were the questions we knew of that we wanted answers to? When it comes to something like this, how do you measure performance? This isn't about measuring whether you did good or bad; that's completely inconsequential. It's not about grading performance, it's about getting a baseline, so you can say next time we've got better or next time we've got worse. It's not about carrot or stick. How do you measure the
response to a red team activity, especially one that they didn't know was going to happen? We picked three metrics. First was time to identify: how long until they realized something was going on. Second was time to contain: how long until those of us having all the fun on the red team lost all our access and the problem wasn't getting any worse. And lastly, percentage of intrusion detected: of all the things we did in AWS, all the resources we spun up all over the place, what percentage were they able to find? Because when it comes to the cloud, whether you're talking about AWS or GCP or
Azure or Alibaba or any cloud you want, the thing is that with the flexibility and the power, things can get out of control fast. I can spin up things in regions and accounts all over the place, and we did. So the question is: do you have the right tools in place to even find everything that people did? Can you find your way back to understanding how bad the security incident actually is? Another question we wanted answered was who's going to take responsibility. This was planned with the leadership team and a couple of chosen people, of which I was one. I like to think they chose me for
technical skills. It may just be because I can be somewhat sadistic, and they knew I would take this maybe a little bit too seriously and see how far we could go with it. But we took effectively all the senior leadership out of the room, and this was another thing we wanted to understand. I think everyone knows there's normally a person or a few people everyone turns to, and one of the things we wanted to test was what happens if none of them are there. Who's going to take up responsibility? Who's going to take charge in that scenario? Because we had a mix of experience levels, junior, more
senior and everything else, so this is something we wanted to understand as well. Another thing that generally happens is that whenever you scale a process to looking after three times what it was initially designed for, it tends to break. The consultancy was about 10 people when I joined, and when we did this it was about 30. We had it in the back of our minds: the process that had served us very well about 12 months before, would it still hold with the company at three times the size? And the last one, just for me, is to understand, and I come back
to this concept quite a lot, what lenses people possess. In my career I've worked in a whole variety of different fields, including security, so I feel I can look at problems in different ways. I always like to understand how people look at problems, because that helps you figure out what solutions they're going to drive towards. Application developers generally look at: does the thing work, and how can I make the new thing? Security people generally look at, effectively, security problems. Operations look at: how do I keep this running in production? One of the important things is that you want T-shaped people who have a wide range of
skills. They have their deep areas as well, but you want them to have a wide range of skills, so when they look at a problem they can see it from different viewpoints. This was to understand: were there any people in the business who had a security lens that we didn't know about, who were able to look at the problem in the way we wanted them to automatically? Or is there a massive gap here and we need to do more? And unknown unknowns: obviously we couldn't know these in advance, because these are the things you don't know you don't know. If you don't know you don't know it, you can't plan what the
question's going to be. So we knew that by doing this we would turn up things we didn't expect, and that's part of the excitement going into it. You get your business value from the things you know you don't know, and then you find out more things as you go along. By taking action, by making it real, by building your culture through action, you can uncover things. You can't just sit there and think about it; you only learn what you actually get through real action. The same concept exists in the operations sphere with chaos engineering: you don't know how your system is going to react in a certain scenario, so you
make the scenario happen and see what happens, because you can't think your way through it. Things are too complex nowadays to be able to think your way through the problem entirely. A big suggestion, if you want to do this yourself, is to set up some rules of engagement. We had a few ground rules set for the red team. One of them was cost: we could only spend so much money on AWS. When you have the cloud and all of a sudden you can run up virtual machines that cost an insane amount of money, it can quickly spiral, and we weren't trying to bankrupt the company,
we were just trying to learn things, so there were some limits put in place. There were also things that me and Will, my partner in crime for this, were blocked from doing. Like, instantly we could have gone and destroyed identity and permissions and everything else, and effectively made it so the blue team could do nothing. Not fun, and it kind of detracts from the purpose. So there were limits to what we were allowed to do. We had some ideas that were like, "oh, this would be awesome", but they kind of defeat the point. Setting up these rules of engagement is about having a healthy tension between the teams, because this is a learning exercise, so
you want people to be learning. You need to give some to and fro, some ability to go back and forth, and some flexibility in there, not just everyone wins or no one wins. It wasn't a competition; it was how can we do the best for everyone. And now that I've potentially brought you all to tears with mental models and preparation, let's actually step into the day and what happened. I'm going to run through a timeline of the day, just so you get an idea. We kicked off at 10 a.m. As I said before, we called everyone in the company together to be co-located in the office, and myself, Will and the three
directors made an excuse: "oh, we've got to go look at something, we'll be back." All we did was walk next door to the coffee shop, open laptops, set up and get ready to go. But we removed ourselves physically, because otherwise it gets pretty obvious if something's going on and you're sitting there in a corner hacking away, and everyone's like, "why aren't you helping?" "Um, maybe because it's me that's doing it." Another thing: I mentioned before that there was a variety of seniority in the room, going from junior to senior. One of the things we wanted to do as part of this was make sure the juniors
got involved. Not that they wouldn't want to get involved, but we didn't want to allow the seniors to take control and fix things without including other people. So when we looked at whose permissions to take away, we took away all the senior members' permissions to do anything and left all the junior members with theirs. The idea was that the seniors then had to channel what they were thinking, their approach, what they wanted to do, through the more junior people. It's really kind of a pairing approach. I think that's critical, because otherwise the people who already know a lot retain that knowledge and it doesn't get
spread. So at three minutes past 10 we tripped a wire. This being our own environment, we didn't have to do a bunch of OSINT; we knew we could do things that would cause alerts to start going off, so people would start to understand that things were happening. This particular consultancy was serverless-first, so what we had was: if anyone ran up an EC2 machine anywhere, an alarm would go off, because there generally wasn't any reason for an EC2 machine to be alive in any of our accounts. So we did that, knowing the alarm was there. Again, this comes back to the point that this is about learning, not about
winning. It's not a competition, so give people a chance to fight back. Give the blue team a chance, because at this point they don't even know they're a blue team. They're all working on other projects and other things for the day; they don't know this is happening yet. Also at 10:03 we decided to phish them as well. We had compliance tooling on all AWS accounts, and we'd already pre-made an email to send out to see if we could get people to click it. Seeing that this company was effectively 95% engineers, you would really hope that no one clicked it. A few people clicked it and went through. So this was also interesting. These are
kind of table-stakes things where you'd think people would know better. At the end of the day, phishing is unfortunately effective, even in 2021. So six minutes after we triggered the alarm and sent the phishing email, someone smelled smoke. Someone noticed and put on Slack, "hey, I think there's something fishy going on. Can people down tools? Can someone come have a look with me?" Paul, the guy who realized this, was one of the more senior people, and his permissions had already been removed, so he couldn't get in. A minute later someone went to help him,
and we instantly blasted her credentials as well. You might ask how we knew what was going on, because we were no longer in the room; we were next door. We had a fly on the wall: someone on the inside who knew what was going to happen, knew it was happening, and was there to relay information back to us about what was happening in the room. Again, we didn't want this to be too stressful. We wanted the jeopardy to feel real enough, but not for everyone to be stressed out of their minds. So having that ability to go back and forth is critical, as is knowing that you're
not pushing too hard and keeping that to-and-fro aspect to it. So we had a spy in their midst, which was quite amusing. Three minutes after someone thought something fishy was going on, Paul sent out a company-wide blast saying, "OK, not a drill, something is actually happening in our accounts. I need everyone to stop what they're doing and we need to mob around this." Five minutes after that, the CEO of the company, the founder, who was sitting across the table from me quite happily drinking a coffee, got a phone call from, I think it was, Paul. I got to see him get a massive grin on his face, turn his phone
upside down and just sit there and do absolutely nothing. The interesting thing about this was that this call is step one of our incident response plan. So even with a well-defined incident response plan, people did other things before they went and read it and did the first thing on there. This is one of the things you want to understand. Another thing we hadn't really thought about too much beforehand was how the blue team would self-organize. Organically they were going to arrive at some kind of structure for how to approach and tackle the problem. What was that really going to look like? What we ended up seeing was effectively
this: you had Paul at the top and, I think, about 15 people on the day underneath him, all feeding information back to him. You can probably imagine what happened: effectively his head was set on fire. There was too much information coming at him, he couldn't handle it all, and there was no clear direction. So 11 minutes after that phone call was placed, they had a "wait, no, we need to stop and regroup" moment: we need to get an idea of what's actually happening, we need to marshal resources, all those kinds of things. At 10:30 they did an access poll: who still has access to our environments,
so they could understand how many people they could channel changes through, who still had access, how they could parallelize what they were doing, and what the best approach forward was. What was interesting is that someone claimed not to have access when he very much still did. We hadn't taken his away yet. That's also an interesting learning. This isn't about punishing people or catching them out; it's about the feedback loops. Now we could go back and ask, "OK, why did you think you didn't have access?" It's not a case of you should have known, you should have been like no,
yes, I have access; it's how can we use that as a feedback loop to help coach and support. We didn't expect anyone to say they didn't have access when they did, but these are the unknown unknowns you turn up as you go along. Seven minutes later they decided on a more mindful structure, with a comms role and a leader role. Instead of everyone going to one person, you had someone whose job was to take all the incoming information, filter it and pass it to the leader, who was trying to do the grand plan, the strategy, and figure out what they were going to
do. This served them a lot better, being able to filter and everything else. And if we were to do this again, they would naturally fall into this structure, as opposed to the initial structure, which didn't really work because it didn't scale. By doing these things, people learn, they get that experience, and it really moves them forward. Four minutes after that they kicked out the spy, the fly on the wall that we had. They were like, "why aren't you helping?" and he's like, "ah...", and they're like, "OK, you can get out." Because at this point we had told them that this was us. I mean,
most of them had figured out for themselves that this wasn't a real attack but an attack being perpetrated from within. We told them because, again, we didn't want the stress levels to go too high; we wanted it to be a learning experience. So we messaged them saying, "hey, but treat it as if it's real." They also now knew who was on the other side, and consultants are, on the whole, a competitive lot, so if anything this might have amped them up a little more in terms of trying to get one over on us. Stress levels went down, which was what we were trying to drive for, but
that was something quite important on the day that we wanted to make sure they were aware of. Seven minutes after that, I managed to find GitHub tokens that were effectively unencrypted in storage, so we managed to break out of AWS. Fantastic. Now I was off in GitHub creating private repos, so as not to be too cruel to the guy who'd accidentally left his credentials lying around. But these are the kinds of things where, when you're in the cloud and trying to go cloud native, you think about everything in the walled garden, and sometimes people don't think about, OK, well, if I don't do this properly,
you know, the problem moves. I think people from a security angle are quite used to this idea that you get a beachhead and can move wider from there, but in the cloud it can be something people don't think about. Really what this comes down to, and something I always talk about probably too much at length, is the core components of AWS. There's one service that is amazing, but it's probably not something most people get familiar enough with, and that's KMS, the Key Management Service. It's KMS in GCP too, and there's an equivalent in every cloud, mostly under the same name, which is nice and easy.
But effectively he'd encrypted his GitHub credentials with KMS and hadn't set the key policy, the resource policy on the key, correctly. So it's encrypted at rest, but anyone in the account could get the credentials back out. This became a learning moment. He's like, "oh, I thought I'd done the right thing. Can you help me with this later?" So this was an opportunity to find it and then go, "OK, cool, this is how you want to do KMS so that only the right people can access it, and only you can access your credentials." So you find these little teachable moments, and done with the right mindset, people aren't trying to hide away from
the fact they did something wrong; they're approaching it with this learning mindset, this growth mindset, that's so critical. Two minutes after I broke out of AWS, it was all quiet on the Western Front. They decided that if they were going to try and combat us, they probably shouldn't be putting everything they were trying to do on Slack, where the red team could see it. We could see what they were thinking and what they were trying to do, so we could always stay one step ahead. At that point they all went into Google Hangouts, I think it was, and started using the chat in there, because that was just something simple to set up.
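Going back to the KMS key policy point: the talk does not show the actual policy, so the following is only a sketch of the general idea. One common way a secret ends up readable by "anyone in the account" is a key policy whose only statement hands the account root principal `kms:*`, which delegates the decision to whatever IAM policies exist in the account. The sketch below builds a narrower policy document where only one named principal may decrypt. The account ID, role name, user name and helper functions are all hypothetical, and real access also depends on IAM policies and grants that this sketch does not model.

```python
import json

# Hypothetical identifiers, for illustration only.
ACCOUNT_ID = "111122223333"
ADMIN_ROLE_ARN = f"arn:aws:iam::{ACCOUNT_ID}:role/KeyAdmin"
OWNER_ARN = f"arn:aws:iam::{ACCOUNT_ID}:user/developer"

# Management actions only: the admin can look after the key but is not
# granted kms:Decrypt here. (An admin holding kms:PutKeyPolicy could still
# rewrite the policy, so this is separation of duties, not a hard wall.)
ADMIN_ACTIONS = [
    "kms:DescribeKey",
    "kms:GetKeyPolicy",
    "kms:PutKeyPolicy",
    "kms:EnableKey",
    "kms:DisableKey",
    "kms:ScheduleKeyDeletion",
    "kms:CancelKeyDeletion",
]


def build_key_policy(admin_arn: str, owner_arn: str) -> dict:
    """Return a key policy document where only owner_arn may use the key."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "KeyAdministration",
                "Effect": "Allow",
                "Principal": {"AWS": admin_arn},
                "Action": ADMIN_ACTIONS,
                "Resource": "*",
            },
            {
                "Sid": "OwnerCryptographicUse",
                "Effect": "Allow",
                "Principal": {"AWS": owner_arn},
                "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey"],
                "Resource": "*",
            },
        ],
    }


def principals_allowed(policy: dict, action: str) -> list:
    """List principals whose Allow statements name this action or kms:*.

    Simplified check: it only understands the {"AWS": ...} principal form
    used above and ignores conditions, wildcards other than kms:*, and Deny.
    """
    allowed = []
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if action in actions or "kms:*" in actions:
            principals = statement["Principal"]["AWS"]
            if isinstance(principals, str):
                principals = [principals]
            allowed.extend(principals)
    return allowed


policy = build_key_policy(ADMIN_ROLE_ARN, OWNER_ARN)
print(json.dumps(policy, indent=2))
print(principals_allowed(policy, "kms:Decrypt"))
```

A small review helper like `principals_allowed` is the kind of check that could be run over key policies after an exercise like this one, to spot keys where more principals can decrypt than intended.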
So at this point we'd lost the visibility of Slack, our fly on the wall had been sent out, and we were like, "OK, we don't know what they're doing anymore." From the red team's perspective, we had no idea what they were doing to try and combat us. We didn't know how things were going in there, so we didn't know whether to push harder or pull back, because we always tried to stay only a bit in front, not hugely in front, just enough that they could chase and feel like they were making progress towards us. At that point we threw the CEO and the CTO back into the room, because they couldn't kick them out and
they could still feed us some amount of information, so we could find out what was going on. Also, Paul, who'd been leading it all, had been pretty stressed out by it, as you can imagine, and we wanted to say, "OK, the CTO is going to come back in; he can take over running this if you want." Giving options de-stresses things a little bit, because although they now knew it wasn't real, I think the stress had peaked. Those kinds of things give people the option to swap out. Three minutes after that, the blue team said that yeah, we think it's all under control, we don't think they have
access anymore, we think we've got it contained. They had not. So Will and I stepped up our game a little bit. We started creating some things with appropriate names, so they would know these were new. Eighteen minutes later they came to the conclusion that no, it was not over, it was still going. Five minutes after that, Will and I decided to send some photos on Slack of us very happily hacking away next door, just to make sure we were all focusing on this being fun and learning, not something super serious. And eight minutes after that, they actually achieved containment:
they got everything we were doing and all our access was revoked, so they'd actually managed to get on top of it. Will and I decided to just chill for a little bit, and 22 minutes after that we went back into the room. If you've never walked into a room and been flipped off by everyone in it at once, it's quite an experience, and it was quite amusing to me at the time. But we returned to the room, and then it went from "OK, the red team part is done" to "we need to move from containment to remediation." So all in all we're
talking about some amount of pre-work investment in deciding what we were going to do, and the actual scenario itself took about two hours. We'd set a time limit on this as well: we weren't going to run past one o'clock. From memory, I think we'd said three hours max; if they hadn't contained it by then, we were just going to stop and move to remediation, because we wanted to make sure we got through the entire life cycle, not just keep going on the same track. And me being a massive lean theory junkie, one of the things I'm always trying to drive in these things is the concept of kaizen,
"change for good", or "continuous improvement" in the most common English translation. So when I say that this is about improving, I'm going to talk specific numbers. You may have been able to follow some of the numbers as we went along. But I always want to reiterate: I don't know whether these numbers are good or bad. I just know that's where we were at that point in time, and now we have something we can improve on. The time to identify, from us doing something to people considering it an active security incident, was 12 minutes. The time to contain was an hour and 28 minutes.
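As a rough illustration of how a baseline like this could be recorded so the next exercise has something to compare against, here is a minimal sketch of the three metrics named earlier in the talk. Everything in it is an assumption for illustration: the function names, the placeholder date, and the resource names are hypothetical, and the clock times are chosen only so the durations match the 12 minutes and 1 hour 28 minutes quoted above, not taken from the timeline.

```python
from datetime import datetime, timedelta


def elapsed(start: datetime, end: datetime) -> timedelta:
    """Duration between two events, refusing timestamps in the wrong order."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return end - start


def percent_detected(created: set, found: set) -> float:
    """Share of red-team-created resources that the blue team also found."""
    if not created:
        raise ValueError("the red team tally is empty")
    # Only count findings that are really in the tally, not false positives.
    return 100.0 * len(created & found) / len(created)


# Placeholder date and clock times (assumptions, see the note above).
first_red_team_action = datetime(2000, 1, 1, 10, 0)
incident_declared = datetime(2000, 1, 1, 10, 12)
access_revoked = datetime(2000, 1, 1, 11, 28)

time_to_identify = elapsed(first_red_team_action, incident_declared)
time_to_contain = elapsed(first_red_team_action, access_revoked)

# Hypothetical tally of what the red team created versus what was found.
red_team_tally = {"vm-a", "vm-b", "user-x", "role-y", "bucket-z"}
blue_team_findings = {"vm-a", "vm-b", "user-x", "unrelated-thing"}

print("time to identify:", time_to_identify)
print("time to contain:", time_to_contain)
print("percent detected:", percent_detected(red_team_tally, blue_team_findings))
```

Using sets for the tally means a finding only counts if the red team actually created it, which keeps guesses and false positives from inflating the detection figure.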
And when it got down to it, the percentage of the intrusion detected, the amount of stuff they found after a couple of hours of remediation, when we gave them some time to try and do cleanup, was about 66%. So they found about two-thirds of the stuff we did. Given they had twice as much time as we did to get on top of it, they still only managed to find two-thirds. I think it was Pareto-principling a little bit: even with another couple of hours, that number would probably only have moved to 75% or so, because what was stopping them from finding stuff now was that they
weren't going down the right thought paths. And this was with me and Will in the room trying to lead them gently down the path. We didn't go, "oh, we did this here"; we went, "have you thought about this?" and tried to see if people could arrive at the right path themselves to find what we did. One of the things you want to do is keep a running tally of everything you did, which we did in Slack. Everything we did went in the tally, so we knew we could go back and find it, and we didn't leave things lying around for the rest of time
because no one could find them. One of the interesting things that came out of this was that one of the first questions was, "but is this realistic?" I found that really interesting, because we put quite a bit of thought and effort into making sure it was a realistic scenario. Effectively, this could all have been done off one set of compromised credentials, so I thought it was very realistic. But other people were like, "oh, but this wouldn't really happen." Yes, it absolutely could. We tried to keep this realistic. I think sometimes with security, and I imagine most people in the room and online have seen this,
thankfully most people haven't had to go through real security incidents in their businesses, so sometimes it's not real to them until it is. I think once you go through one, you get it. And that's almost the point of the exercise we did: make it real now, as opposed to waiting until it's actually real. Because the thing about security is it's not if an incident happens, it's when an incident happens. I always come back to this: backups never fail, restores do. If all you ever do is back up machines and you never restore them, the backup is never really the problem. If you do backups but they're
unrestorable, doing the backup is irrelevant. So having an incident response process and having all this stuff in place is great, but if you never test it, never prove it out, never make it real, then it's not worth all that much. Fundamentally, what I took away from this, and where I got my idea of what the difference between the red team and the blue team was and what we needed to do going forward to enable the blue team to fight back harder and faster, came back to "Forty-Second Boyd", John Boyd, the famous pilot in the US Air Force. He has this concept of an OODA loop.
It comes from the idea of dogfighting in airplanes, and the idea is this loop of observe, orient, decide and act. If you can cycle through these four quicker than your opponent, you can outmaneuver them, move faster than them, and you will win. Really, when it came down to it, we didn't have the tools in place to observe, to be able to see what was going on. This observability gap came down to there not being enough tooling in place. So the red team were off doing things and the blue team just couldn't see what was happening, and because they couldn't see what was happening, they couldn't orient, they
couldn't decide they couldn't act and the amount of time it took them to observe what was happening by the time they got through that we'd already gone into more things so they're always operating on stale information and that really was for me the fundamental like learning that i took away is okay what can we do to improve this how can we give people the observability to understand what's going on across all our environments and you know as we talked as we worked as we identified things we were capturing improvements as we go along things we wanted to do things that we thought were good ideas based on what we learned from this and we slapped them all into jira because
what else do you do with good intentions and good ideas and what this became was a living backlog as a consultancy our bench capacity anyone on bench who didn't have something very specific they needed to do just worked away at this backlog and that was just how we approached things and that's how we just day after day week after week we improved on where we were at one of the interesting things that came out of this was it ended up being the beginning of a tradition at the company like i'm no longer with that company and neither is will for well legitimate reasons at the time but this is still something that's talked about
it's still something that new people coming into the company hear about and learn about it's become one of those cornerstone myths of the culture of the company this is something that will live with the company for as long as the company exists it really made a focus on security to take the aws parlance of security as job zero effectively the company made the investment of we're going to have no revenue for a day in order to make security important in order to make people learn and make people realize how important security is to us actually as a consultancy there was no available time that day that's the level of investment we made
and you know it's not for a giant enterprise they can't just not have anyone work for a day right but really it's hard to tell people security is important if the investment isn't there right if all the investment goes into the security function but for the teams there's never a trade-off against delivery and it's just one of those things that really needs to happen and really makes it real to people makes people sit up and pay attention i'm just going to end off with just why red team if you were going to sell this or when i do sell this into businesses and raise it up a lot as something important i'm gonna use a chess
idea just to finish it off so the difference between a novice and a master so a novice chess player looks at the board and sees on their side 16 pieces and there's the opposing 16 pieces and the way they look at the board is all about the individual pieces moving around when you get to a master they look at the board in patterns in groups and chunks so they're able to move more quickly because they already have experience and they're reusing that experience when it comes down to it and really at the end of the day if a real security incident happens in your business would you rather have novices around you or would you rather have masters
i think there's only one answer to that question and that's all i had hopefully that was good for people uh i am i think around somewhere in gather town for questions or if someone is able to get the questions to me then i believe i have five minutes left for any questions or find me on twitter or linkedin or whatever you want and come and ask questions later i'm always happy to talk about this stuff
awesome it's working yep
yeah if there's no questions that's perfectly fine um
oh good well i hope everyone has a wonderful conference and a wonderful weekend and for all those in newcastle uh have a lot of fun