Tom Webb - Incident Response Awakens

BSides Augusta · 2016 · 22:58 · 31 views · Published 2016-09
Style: Talk
About this talk
Video from BSidesAugusta 2016.

Thank you. So today we will talk about Incident Response Awakens. We're going to talk about common problems that IR teams face, and we're also going to talk about the actions that our group took to deal with these. How many people here run a SOC, manage a SOC, work as an analyst, or have anything to do with any of that? Okay, great, so we've got a good number of people here. So, about me: I started off working in state law enforcement, did that for five years, and got bored with it, so I decided to go to a university, where there are lots and lots of incidents. It's a lot of fun and I love it. I've been

there for 11 years. I've done everything from being a general IR guy to security architect for the university, and now I manage our SOC, and I'm a tier 3 analyst when I need to be. I've been a Storm Center handler for three years, and I'm GSE number 76. So what does our environment look like, and why do we have some of these challenges? We have approximately forty-two thousand people enrolled, 18,000 faculty and staff, and eight campuses distributed throughout the state, and we have distributed IT everywhere. We have departments that have their own IT, and sub-departments of those departments that have their own IT. We have 10 to 12 Active Directories, and we have different Exchange servers everywhere,

and we only centrally manage about 3,000 desktops, so everybody else is either completely unmanaged or managed by different people. It's quite interesting and fun trying to secure this type of environment. So what are the challenges that typically everybody runs into? Staffing: we all have staffing problems. Slow response time: from the time an alert or an alarm is generated, how quickly are we picking it up? Slow time to collect, slow analysis, and a distributed environment: how do we get tools and things in place to help us speed these things up? And then a lot of times we respond to stuff when we really shouldn't have, but we didn't know until we started digging into it. So we had

an opportunity for change. There was a Department of Revenue breach at the end of 2012, and as more and more came out about it, the legislature was really upset about it. We saw the writing on the wall for new statutes coming down to put requirements on us, and we already had several things that we wanted to get done, so we used this opportunity to say, hey, now is a great time, before these laws come into place, to start putting this into effect. So we had a two-year project in which we implemented a lot of new technologies. The first thing we did was implement a SIEM and full packet capture. We also had a remote incident response

tool, and we implemented a data discovery system. Then, to help bring it all in and allow us to get these things installed, we developed some minimal security standards where everybody on campus is supposed to have our data discovery tool installed, and systems that have sensitive data have to have our remote incident response tool installed. So, new staff: we were fortunate enough to be able to double our staff. We added two more incident response people, so we now had a staff of five; we added two more GRC people, for a staff of three; and then we added a public relations person. That helped out a lot, because as a university they want us to do a lot

of outreach to the community, and it was just too much for us to manage that ourselves, so we added a public relations person to help us do these types of things. So we had all these people; now how are we going to get them trained up to be useful? It's very hard to get someone who already has security experience, so we do a lot of SANS training, we bring in a lot of product training to get them up to speed on products, and then within our group we do Tech Talks. For things that other people are familiar with, we'll have a bi-monthly meeting where they'll go through and say, hey, I've figured out

this tool, and let's implement it. And then mentoring, and this is really important for our group. Normally how we do it is: we have all the standard operating procedures, we'll sit with people and show them how to do it, and then we'll give them the chance to go through it. As we're sitting there watching them, we're always asking these questions: why are we doing this this way? Why wouldn't we do it this way? What makes sense? We're always challenging them to make sure that they understand the process, and challenging them: hey, if you see an improvement, let's make it. And it takes us about 18 months to have somebody

completely self-reliant, where I don't feel like they're coming in every day asking me questions about how do I do this, how do I do that. So it's a long process. For slow response time, the first thing we did was add a SIEM, and we got more eyes on glass, which is just what we needed. We also have automated prioritization based upon which departments the people are in and what kind of data is on their systems; those are obviously the things that we go to first. So here you can see, the beginning of 2014 to the end of 2014 is where we implemented our SIEM, and then we got more people, so it's pretty

dramatic change. In 2013 it was taking us 29 days on average from an alarm coming up to somebody going through and triaging it, and by the next year we're down to five days from the time an alarm comes up. So just from the sheer fact of us adding more people, the prioritization, and having a SIEM, it was a pretty dramatic change. In the last couple of years it's slowly decreased, but not dramatically, so we can pretty easily attribute this direct change to just getting more people and adding some better log monitoring. So, slow analysis. It's always tough: you're going through and you want to

get faster at closing the incidents that you're working. One thing we did was change how we're analyzing. I read a post back on the Storm Center last month about prioritizing incidents and where to get the data. So for example, the top one up here is for phishing. If we have a phishing incident, the primary sources in this case are our Bro logs (Bro logs are awesome), full packet capture, and SMTP logs. We have lots of other things we could be looking at, but to answer the questions we want to answer, these are the primary things to look at. And so, to help guide our people

through the most efficient process (not that they're not smart enough to use these other means, but this is the most efficient way to get the answers that we're looking for), we've mapped all these out. For each one of these, how to use Bro and how to do these things are all part of our standard operating procedures. But it seemed like, and I'm guilty of this too, we'd find something we're really interested in and jump down the rabbit hole, which is fun and cool, but it wasn't necessarily where we needed to be in the analysis process. So this has been very helpful for us. The other thing, and just across the

industry this has been happening, is that we've reduced the number of whole-disk collections that we're doing, mostly due to the incident response tool. We can remotely grab whatever we need, memory, timelines, all that kind of stuff, so there's no reason for us to grab the whole disk. When we do now, it's mostly for compliance reasons, HR reasons, or legal and criminal investigations, or if we have a confirmed loss we'll make sure we grab a whole disk. With us collecting less data, we can analyze stuff much faster. Also, full packet capture is awesome. If you guys don't have full packet capture, I feel for you. I used to be there: we would have a Snort

alert, and I'm like, all right, I've got one packet and then I have flows, and I'm like, what happened? Then you normally have to get on disk to start seeing what really happened, or make some educated guesses. If you're fortunate enough to have Bro installed, maybe you have a good idea what happened, but without that it can be difficult. With full packet capture, a lot of times we can see that the guy talked out to the C&C, and the C&C was already down by the time he talked out, so we're safe. Those kinds of inferences can dramatically speed up our analysis process. So this goes through our total

investigations, with the years down at the bottom. These are actual investigations; these aren't incidents or alarms. This is people collecting data, analyzing it, and seeing if the incident is confirmed or not. In 2013 we had 103 incidents we investigated, and we spent 741 hours on those. The next year, in 2014, you see a pretty big dip, and that's because during this process we were installing all these technologies that we're talking about today, so we were doing a lot more engineering and less responding, and that's why that number went down. So 2013 and 2015 is a pretty good comparison. You see the uptick, and you're going, wow, why was this? And the

reason, partly, is we had a couple of big incidents where we had 15 to 20 systems compromised all at the same time, so that was a little more complicated for our guys. Plus we're doing a lot of mentoring, so each time we're mentoring on an incident we've got two people working the same incident, and we're getting double time. But as you can see, this year, with our guys having been in place for several years, we're able to process many more investigations in a lot less time, and this is up to the beginning of August. This one is a better graph, really. You can see that as our people get more

familiar with our incident process, and with us improving our processes and using the tools more efficiently, we're now able to get an average investigation down to about two hours. This is hands-on time, not closing time (I'll talk about that next): when they're actually looking at it, they can analyze the data, determine whether the incident has happened, and proceed within two hours. So this is a fun one. We actually didn't collect this metric at the time, but this is time from data collection to analysis. We get an alarm and I have an IP address. Now the problem is, where is that computer? On a large university network this is a

major, major problem we have. This is a guesstimate, but it was about two and a half days on average before we had our IR tool. The process goes like this. We hope that it's in the IP spreadsheet: we have a list of networks and who the manager of each network is. It's very inaccurate, because people are always starting and leaving. Then we have to check NAC logs; hopefully they logged into a NAC segment, so that we have username, password, all that good stuff for them. If not, we check DHCP, and hopefully their computer name is something we can recognize and not "student 1". And

then if not, we Nmap it, and hopefully we can log into it over RDP and see what the username is. If that doesn't work, well, we've already spent 45 minutes trying to track this system down, so we have to block it. We just say, all right, block it, and hopefully somebody's going to scream and say, hey, I can't get onto the internet, what's wrong? So it's very, very complicated. And the problem was, as soon as somebody gets wind that a system is compromised, you have the possibility of them running AV or logging on and saying, hey, I found it and we deleted it. And they've just stomped all over your investigation. This was very,

very common. So now, with our incident response tool pre-deployed, we don't even tell them about it until we can confirm that the incident happened. It's just so much easier. Now, we're not deployed everywhere; I wish we were, and we still have to go back to the old analysis method when we're not. Next, average days to close. This is based on a 24-hour day, not a 12- or 8-hour work day or anything like that. In 2013 it took us an average of ten days to close an incident. We really got the incident response tool deployed mid-2014-ish, so you can see it go from four and a half days on down to how quickly we can

now close an incident, which is about four and a half hours. The breakdown is about two hours on average for us to analyze, and about two hours for us to go out with the incident response tool and collect what we need. So, distributed environment. I already touched on a lot of this. There's a policy requiring centrally managed tools, but it's very difficult to get such a distributed environment all working on the same thing when there's no way to deploy tools with a single tool. So you have to meet with the deans and department heads. We have them do a self-service questionnaire every year, and then we talk to the deans, saying,

you know, this is what we expect, this is where you guys are at, what should be the next steps? The other thing, just to get general acceptance, is that the tools have to be very easy to manage and self-updating, because a lot of the small departments have no way to push things; they are still sneakernetting things around. If we can't have an automatically updating tool with very little overhead, it's very hard for us to get people to install this stuff. So this is probably the most important time saver, and again it's hard to quantify: unnecessary responses. Previously what we'd have to do was, if

we saw a system was compromised, we would ask the admin or the user, hey, do you have sensitive data on the system? And ninety-nine percent of the time, what did they say? No way. Well, in about 75% of cases they did; almost always, when we went and collected and looked, they had some kind of sensitive data. Now, with the data discovery tool on all these systems, we can quickly look and say, okay, this system doesn't have any sensitive data, it's not a critical asset, and it has commodity malware, so we're simply going to knock it off the internet and tell them to wipe it. And so the sheer fact of us

not having to go through every compromise that we had on campus has saved us untold hours. It's been a huge asset for us. So, key takeaways. Start gathering metrics now. If you aren't gathering good metrics, it's going to be much more difficult for you and your group to make the case for buying things. When we had this incident, we already had plans in place for what we wanted to do to improve the metrics that we already had. It also makes a great case when you talk to the CIO and he sees, wow, it's taking you guys five days to work an alarm,

let's get that down. You already have that plan in place and you can hand it over immediately. Many times before the DOR breach, we already had these written up, and literally every time we'd have an incident we'd change the date and put that same request in front of people. They would just keep seeing it over and over, and eventually they got that they needed to do this. Maybe the first time it was somebody they didn't care about; maybe it took a VP getting compromised and yelling at the CIO, going, what's going on?, for him to start listening. So

don't give up. Make sure you have your plan of where your metrics are at and what you want to do; it just makes for an easier business case. Staff retention is always a problem. Training shows that you're invested in these guys and lets them know you care about them. Have a progression path for them; even if you might not be able to give them a bump, there's still always the, hey, we're working on this process, don't give up. And then quality of life. At the University of South Carolina we have a great working environment. We're very flexible, and I don't get called a whole lot when things happen, so it's very nice to

manage the SOC and not get called at two in the morning every day. So what had the most impact? That's the question I get a lot: out of all the tools, if I had to get one? Now, we already had a lot of other tools in place, so don't think that if you only had one tool this would be the one. But in our case, the data discovery tool saved us so much time not analyzing systems that it would be my first choice; it dramatically reduced how many incidents we worked. The second is endpoint forensics. Again, in our distributed environment, this saves us a lot of time by letting us go out immediately, reach in, and collect stuff. Full

packet capture, again, is a huge time saver when you're looking at network traffic. And then the SIEM last. We already had something SIEM-like, since we had Security Onion deployed, so we already had a lot of SIEM-type features, and it wasn't as big a deal as some of the other stuff. So what do we have next? We want an additional position to manage systems. We already have that PD written and we're trying to work on that process. So the next time somebody asks us, why is it only down to five days?, well, our team is all splitting up our time managing some of these systems. We need a

dedicated person to manage the systems, and then we can perform more IR processes. Automated quarantine: that's just us doing some API scripting for our incident response tool so it can go out there and quarantine systems, which again will reduce our time to respond. Do more threat hunting on campus, and make better use of our threat intel to respond to stuff faster and make better priority changes. I want to especially thank my SOC guys, our CIO James Perry, and the rest of our team; without all their hard work we wouldn't have these awesome numbers to show people. Questions? These slides will probably be on my GitHub. Yes, sir? Yeah, so DLP. Most people say

DLP, so you're familiar with that word: scanning for sensitive data, SSNs, bank account numbers, any of those types of things. The number one is Identity Finder, and then McAfee has their own DLP. We aren't using the prevention portion, because at a university I would pull all my hair out. The important thing from our perspective at this point is for people to be able to know what data is on their system and then be able to remediate it. Our main goal is for them to actually remove the data, but if they keep it, we have reporting, and then they have other requirements they have to meet. [Audience question] Like, for example, do you

compare with, like, regular expressions? Yes, yes, absolutely. That's what it does: it will crawl the different files it knows how to parse and run regular expressions for [unclear], credit card, and bank account numbers, however you set it up. Most of them now have a huge predefined library for HIPAA and all that kind of stuff, and you just turn it on and have it scheduled to scan on a weekly, monthly, or whatever basis you feel is necessary. Yes, sir?
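
As an illustration of the regular-expression scanning described in this answer, here is a minimal Python sketch. It is not how Identity Finder or McAfee DLP is implemented: the two patterns and the names `luhn_ok` and `scan_text` are hypothetical, chosen only for this example, and real products ship far larger pattern libraries and parse many file formats.

```python
import re

# Illustrative patterns only (assumptions, not taken from any product).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")        # e.g. 123-45-6789
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")    # 13 to 16 digits, optional separators


def luhn_ok(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def scan_text(text: str) -> dict:
    """Return SSN-like strings and Luhn-valid card-like strings found in `text`."""
    ssns = SSN_RE.findall(text)
    cards = [m.group() for m in CARD_RE.finditer(text) if luhn_ok(m.group())]
    return {"ssn": ssns, "card": cards}


sample = ("SSN 123-45-6789 on file; card 4111 1111 1111 1111 expires soon; "
          "order 1234 5678 9012 3456")
result = scan_text(sample)
print(result)
```

The Luhn checksum is one common way to cut false positives on long digit strings such as order numbers; whether the tools mentioned in the talk use it is not stated.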

[Audience question, partly inaudible: for systems that are not of much interest to your team, where you tell someone to wipe the system, do you collect any artifacts from those systems so that you can look for them later?] So the question was... did everybody hear that question? Sometimes we do, sometimes we don't. It really depends on how commodity it is. If it's something that we see all the time, no. If it's something new in our environment, generally we will, and we feed our IOCs into our ticketing system, and then that gets propagated into our incident response tools and those types of things. But not always; I'd say [unclear] percent of the time we do.
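
The decision described in this answer and earlier in the talk (knock a system offline and wipe it when it holds no sensitive data, is not a critical asset, and shows only commodity malware) can be sketched as a small rule. This is a hypothetical illustration: the `CompromisedHost` fields, the `triage` function, and the outcome strings are assumptions for the example, not the team's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class CompromisedHost:
    has_sensitive_data: bool    # e.g. reported by a data discovery scan
    is_critical_asset: bool
    malware_is_commodity: bool  # something the team sees all the time


def triage(host: CompromisedHost) -> str:
    """Pick a response path for a compromised host (illustrative rule only)."""
    if host.has_sensitive_data or host.is_critical_asset:
        return "investigate"
    if host.malware_is_commodity:
        return "isolate_and_wipe"
    # New in the environment: collect artifacts and IOCs before wiping.
    return "collect_iocs_then_wipe"


print(triage(CompromisedHost(False, False, True)))
```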

[Audience question, partly inaudible, about the 18 months it takes to train somebody.] Yes, well, it depends. I would say we generally do at least two SANS classes, so that's roughly six thousand each, and then at least two other training classes, so maybe ten to twelve thousand, potentially. Time's up? Do we want to... okay. Thank you guys. Feel free to reach me on Twitter or by email if you have any more questions. Thank you very much. Oh yeah, yes? [inaudible] What was the catalyst for us to start this project? [transcript cuts off]
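
As a closing illustration of the "start gathering metrics now" takeaway, the alarm-to-triage average quoted in the talk could be computed from ticket timestamps along these lines. This is a sketch under assumptions: the `mean_days_to_triage` function and the input shape are hypothetical, and the sample timestamps are invented for the example rather than taken from the speaker's data.

```python
from collections import defaultdict
from datetime import datetime


def mean_days_to_triage(tickets):
    """Average days from alarm to triage, grouped by the year of the alarm.

    `tickets` is an iterable of (alarm_time, triage_time) ISO 8601 strings.
    """
    totals = defaultdict(lambda: [0.0, 0])  # year -> [sum of days, count]
    for alarm, triaged in tickets:
        start = datetime.fromisoformat(alarm)
        end = datetime.fromisoformat(triaged)
        bucket = totals[start.year]
        bucket[0] += (end - start).total_seconds() / 86400
        bucket[1] += 1
    return {year: total / count for year, (total, count) in sorted(totals.items())}


# Invented sample data, for illustration only.
sample_tickets = [
    ("2013-03-01T08:00:00", "2013-03-31T08:00:00"),
    ("2013-06-01T08:00:00", "2013-06-29T08:00:00"),
    ("2015-02-01T08:00:00", "2015-02-06T08:00:00"),
]
averages = mean_days_to_triage(sample_tickets)
print(averages)
```

Tracking the same figure year over year is what lets a team show a change like the one described in the talk when making a budget request.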