Hype and Reality: Practical Advice for Implementing and Evaluating AI/ML for Cybersecurity

Name: Hype and Reality: Practical Advice for Implementing and Evaluating AI/ML for Cybersecurity
Uploaded: 2021-06-19
Duration: 29 min 45 s

BSides SATX · 2021 · 29:45 · 35 views · Published 2021-06

Speakers

Edward Wu

Tags

Category: Technical

Topic: AI Security, Detection Engineering, Threat Intel

Style: Talk

About this talk

Edward Wu separates AI/ML hype from cybersecurity reality, examining where machine learning is genuinely feasible and where it falls short. The talk covers practical applications—behavioral attack detection, alert prioritization, and assisted investigation—and offers concrete recommendations for teams implementing or evaluating AI/ML solutions, emphasizing the critical importance of domain expertise, data quality, and realistic expectations around model performance.

Original YouTube description

Title: Hype and Reality: Practical advices for implementing and evaluating AI/ML for Cybersecurity
Presenters: Edward Wu
Track: In The Clouds
Time: 1300
Virtual BSides San Antonio 2021
June 12th, San Antonio, Texas
Abstract: For a long time, AI/ML has been portrayed as the magic "silver bullet" that would solve everything in cybersecurity. In this talk, I will separate the hype from reality, present real-world examples of where the application of AI/ML is feasible and beneficial, and highlight challenges and limitations. At the end of the talk, I will also provide concrete advice on how to best implement and evaluate AI/ML technologies.
Speaker Bios: Edward Wu leads AI/ML and detection capabilities at ExtraHop Networks. He specializes in the intersection of machine learning, software engineering, and cybersecurity, and has built innovative next-gen technology for behavioral attack detection, automated security operation, network/application monitoring, and cloud workload security from scratch. He holds 10+ patents in ML and cybersecurity, co-authored 3 papers in top academic security conferences, and is a contributor to the MITRE ATT&CK framework. Prior to ExtraHop, he worked in automated binary analysis and software defenses at UW Seattle and UC Berkeley.

Transcript [en]

Hello, my name is Edward, and today I'm going to talk about hype and reality: practical advice for implementing and evaluating AI/ML for cybersecurity. A quick disclaimer before we get started: the opinions expressed in this presentation are my own and do not reflect the views of my employer.

One slide about myself. I'm currently the AI/ML and detection lead at ExtraHop Networks. I founded ExtraHop's cloud ML behavioral attack detection service six years ago, and I previously worked on automated binary analysis and software defenses at UC Berkeley and UW Seattle. Finally, a fun fact about myself: I built the first working remote code execution exploit for juice bonnet a decade ago while I was working as an undergrad research assistant. My talk today is split into two sections. In the first section, I'm going to talk about what AI/ML can and cannot do in cybersecurity today, and in the second section, I'm going to talk about a set of recommendations for

practitioners who are either implementing or evaluating AI/ML-based cybersecurity solutions. So let's start with the hype. The first hype we're going to talk about is the claim of having a single holy-grail machine learning algorithm that can solve all the problems in cybersecurity. In reality, applying AI/ML to cybersecurity involves solving many different problems, and as a result, different algorithms are needed to solve different subtasks. It's similar to how an autonomous driving system does not involve a single machine learning algorithm but instead utilizes dozens of different AI/ML modules. On the bottom left-hand side, you can see an architectural diagram of Udacity's autonomous driving system, and you can see there are

a few different submodules, such as perception, object detection, localization, planning, and control, that work in unison to deliver the final autonomous driving behavior. The second hype I want to talk about is detection with no false positives. AI/ML-based attack detection is very popular today, and a lot of vendors claim that their AI/ML-based solutions are able to identify attacks with no false positives. However, in reality, even detectors based on perfect data, perfect algorithms, and perfect domain expertise will still require additional human analysis and could still generate false positives. The reason is that the definition of "secure" actually varies greatly across customers and organizations. Sometimes what separates an attack from a benign behavior is actually the

underlying operational context or the intention of the behavior. For example, in one environment, user A performing CIFS share enumeration will be considered malicious, but the exact same behavior could be benign if user A was responsible for a new data mapping initiative. Frequently, during operation, a lot of this context is actually unavailable to the component performing the analysis, so it's incredibly difficult for those components to read operators' minds and understand the intention of the behaviors observed. The third hype, which you might have seen a lot, is the claim of an autonomous easy button. However, the reality is that the cost of incorrect remediation and quarantine can be extremely high,

depending on the operational criticality of the asset where the false positive was identified. Current state-of-the-art technology, in general, is not able to respond to attacks autonomously without risking taking down or negatively impacting normal business operations. That said, response automation can still be achieved in very specific environments and scopes, but the prevalence of false positives definitely makes it quite risky to operationalize in a very broad deployment. We've talked about a few claims that are unrealistic; let's now look at how AI/ML can actually assist in cybersecurity today. Right now, AI/ML can already help with many analytical tasks and can be used to convert data and telemetry into actionable insights. Some examples of these tasks include behavioral attack

detection, prioritization, and assisted investigation, as well as many others. While far from perfect most of the time, AI/ML, if done right, can be an extremely powerful force multiplier for security teams today. Now let's look at a few of these applications. The first application I want to discuss is behavioral attack detection. In general, behavioral attack detection refers to the practice of observing the runtime behavior of certain entities and identifying potential attacks. These entities can be executables, users, devices, or servers on the network, as well as groups of devices or subnets. At the same time, runtime behavior can be recorded in many different formats. For example, on the agent side, runtime behavior can be recorded as binary execution traces

as well as log data, such as system logs, application logs, or VPN logs. In addition, runtime behavior can also be recorded in the form of NetFlow, as well as metadata extracted from parsing network packets. AI/ML for behavioral attack detection is actually a very popular application domain. Compared to traditional heuristic and rule-based approaches, AI/ML-based solutions provide significant advantages in terms of noise and, at the same time, offer the ability to dynamically adapt to the environment that's being protected. In addition, AI/ML solutions are able to leverage operational context, which is critical to the detection of sophisticated attacks or unknown unknowns. Let's now look at an example of how to detect suspicious PsExec activity in the environment. So,

for those of you who are not familiar with PsExec, it is a popular remote access tool that enables users to access other Windows machines without pre-deploying any server or agent. It is commonly used by administrators, but at the same time it's also frequently used by attackers to move laterally across different Windows machines and, when combined with Mimikatz, for privilege escalation. So, in order to detect suspicious PsExec activity in the environment, SOC analysts or cyber defenders might start with a relatively naive heuristic-based approach, where they would just alert if any use of PsExec was observed in the environment. It is quite obvious that this approach, or heuristic, will be very noisy due to

different administrators' benign uses of this tool. Given that observation, the cyber defenders might choose to level up the sophistication and refine the approach by adding an additional condition, so that the alert will not be raised if the client device is in the set of known administrator laptops or workstations. However, this approach is also not easy to operationalize, because it requires constant maintenance as well as tuning to reduce noise caused by non-administrators or management tools that use PsExec.
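The two heuristic rules described above can be sketched in a few lines of Python. This is only an illustration, not the speaker's implementation: the event fields (`client`, `server`, `tool`), the `ADMIN_DEVICES` allowlist, and the sample device names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical allowlist of known administrator laptops/workstations.
# Someone has to keep this set current by hand.
ADMIN_DEVICES = {"admin-laptop-01", "admin-ws-02"}


@dataclass(frozen=True)
class RemoteExecEvent:
    client: str  # device that initiated the session
    server: str  # device being accessed
    tool: str    # remote execution tool observed, e.g. "psexec"


def naive_rule(event):
    """First heuristic: alert on any observed use of PsExec."""
    return event.tool == "psexec"


def allowlist_rule(event, admin_devices=ADMIN_DEVICES):
    """Refined heuristic: alert unless the client is a known admin device."""
    return event.tool == "psexec" and event.client not in admin_devices


# Made-up sample events for illustration only.
events = [
    RemoteExecEvent("admin-laptop-01", "file-srv", "psexec"),  # routine admin work
    RemoteExecEvent("patch-mgmt-01", "hr-ws-07", "psexec"),    # management tool, not allowlisted
    RemoteExecEvent("finance-ws-03", "dc-01", "psexec"),       # worth investigating
]

naive_alerts = [e for e in events if naive_rule(e)]
allowlist_alerts = [e for e in events if allowlist_rule(e)]
print(len(naive_alerts), "alerts from the naive rule")
print(len(allowlist_alerts), "alerts from the allowlist rule")
```

In this made-up data the allowlist silences the routine admin session, but the patch-management host still alerts until someone adds it to the set by hand, which is the maintenance burden the talk points out.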

So one possibility for applying AI/ML to the detection of suspicious PsExec use involves unsupervised learning algorithms. Unsupervised learning algorithms are algorithms that can identify underlying patterns in input data and mark statistical outliers. These algorithms are commonly used for building predictive models of benign behaviors in the environment, and they are capable of self-adapting to changing environments, for example the introduction of new users or devices, as well as changing underlying behaviors. Conceptually, unsupervised learning algorithms are a good fit for implementing an "unusual" operator in detection rules. Sample unsupervised algorithms include clustering, isolation forest, PCA (principal component analysis), and VAE (variational autoencoder). So, in order to apply unsupervised learning

to enhance the efficacy and accuracy of suspicious PsExec detection, one can build an unsupervised machine learning model that learns the common uses of PsExec based on different factors over time. These factors could include the time of use, the user that initiated the session, the client device, the role of the client device, the server device, and the role of the server device. The machine learning algorithm could then alert if an unexpected use of PsExec is observed. The benefit of applying machine learning to suspicious PsExec detection is that this new model will be able to automatically adapt to different benign uses of PsExec in the environment across multiple factors. In addition to

that, it could also help to detect really subtle or potentially stealthy attacks, for example when an IT administrator starts to run PsExec from a finance machine, during the weekend, and remotely. The next practical application of AI/ML I want to discuss is prioritization. Today, security teams are always overwhelmed by alerts and vulnerabilities, and good prioritization helps them maximize the ROI of their limited time and resources. It also makes a lot of sense because, in the consumer technology world, the same kind of AI/ML-based recommendation systems have been widely adopted and are the driving force behind many tech giants. Today you can already see or experience a lot of these systems in the form of digital content

recommendation, ad targeting, and merchandise recommendation. Let's look at a concrete example of an alert prioritization project. To get started, the SOC analyst might choose to build a rule where alerts on devices in a specific subnet should be prioritized first. However, it's obvious that this approach is not very precise or accurate, because in general prioritizing alerts requires analysis and also depends on many different factors. A more refined heuristic approach might involve the analyst building a complex point-based system that increases and decreases relative priority based on different factors. However, this approach is still very manual, and it's very hard to scale, especially when the number of factors increases above 10. So prioritization is one area where a

category of machine learning algorithms called supervised learning can help. In general, supervised learning algorithms are capable of learning the relationship between the input data and the output label. Supervised learning algorithms are already used in areas such as image classification, spam classification, optical character recognition, and property price prediction. Sample supervised learning algorithms include SVM (support vector machine), linear regression, logistic regression, and neural networks. To continue with the example of the alert prioritization project, practitioners can leverage supervised learning to build a model that is able to predict the priority of a given alert based on a historical data set containing manually triaged alerts. Essentially, in this case, the machine learning model will be able to infer

the relationship between relative priority and multiple factors by observing how the analysts or the security teams have triaged alerts in the past. The third practical application area of AI/ML I want to touch on is assisted investigation. It's obvious that modern security tools generate tons of alerts, and alert investigation is often extremely time-consuming. I think there are reports saying that, on average, investigation of a single alert could take up to 20 or 30 minutes. In this area, AI/ML can assist by automating the process of context gathering as well as incident grouping. For context gathering, AI/ML solutions are able to simulate the human analyst's behavior of, for example, going to 15 different dashboards or queries

and looking for unusual behaviors or spikes at the time of the alert. For incident grouping, AI/ML solutions can analyze multiple alerts and identify whether there are pre-existing correlations among them that might indicate some sort of multi-stage attack campaign or a bigger incident. So we've touched on a few practical applications of AI/ML; let's now change gears a bit and talk about some recommendations. To get started, I want to talk about recommendations for practitioners who are actually planning to implement AI/ML solutions or incorporate AI/ML into some of their operations. The first recommendation I have for implementing AI/ML is the importance of picking your battles. In general, applying AI/ML takes a lot of resources, and the ROI might not make sense for

certain low-value problems. In addition, it's much easier to apply AI/ML to tasks that are already well defined and narrowly scoped, are associated with existing data sets, and can be solved or mostly solved by human heuristics. The second recommendation I have for implementing AI/ML is to make sure you have all three key ingredients. In general, successful application of AI/ML requires a good blend of these three key ingredients, which are data, data science, and cybersecurity domain expertise. Most of the time, the quality and quantity of data have the biggest impact on the success of an AI/ML application. However, one common problem I have observed is that a lot of times security teams

will hire a data scientist and simply point him or her to a large collection of cybersecurity data without providing the necessary domain knowledge first. Unfortunately, it's not going to work most of the time, because in order to properly apply data science, the data scientist also needs to have a reasonably good understanding of the cybersecurity problem he or she has been tasked with solving. The final recommendation I have around this is to expect and embrace iteration. Similar to other applications of AI/ML, no machine learning model or capability is remotely perfect, even after the first ten iterations, and from my experience this is especially true for cybersecurity applications, where the problem domain is often not well understood or

extremely complex and dynamic. A good rule of thumb is to expect that building a prototype with 99% accuracy is only 25% of the journey. In addition, for AI/ML-based detection projects, one should expect false positive reduction to be the most challenging bit.
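A small base-rate calculation helps show why a prototype with 99% accuracy can still be far from usable, and why false positive reduction dominates the remaining work. The sketch below is an illustration only: the event volume, attack rate, and detector rates are made-up assumptions, not figures from the talk.

```python
def daily_alert_breakdown(events, attack_rate, tpr, fpr):
    """Split one day of events into true and false alerts.

    events:      total events scored per day (hypothetical)
    attack_rate: fraction of events that are truly malicious (hypothetical)
    tpr:         true positive rate of the detector
    fpr:         false positive rate of the detector
    """
    attacks = round(events * attack_rate)
    benign = events - attacks
    true_alerts = round(attacks * tpr)
    false_alerts = round(benign * fpr)
    true_negatives = benign - false_alerts
    return {
        "true_alerts": true_alerts,
        "false_alerts": false_alerts,
        # Accuracy counts every correct decision, including the huge benign majority.
        "accuracy": (true_alerts + true_negatives) / events,
        # Precision is what the analyst experiences: how many alerts are real.
        "precision": true_alerts / (true_alerts + false_alerts),
    }


# Assumed scenario: 1,000,000 events per day, 1 in 10,000 malicious,
# and a detector that is right 99% of the time on both classes.
breakdown = daily_alert_breakdown(
    events=1_000_000, attack_rate=0.0001, tpr=0.99, fpr=0.01
)
for name, value in breakdown.items():
    print(name, value)
```

Under these assumptions the detector is 99% accurate, yet 9,999 of its 10,098 daily alerts are false positives, so fewer than 1 in 100 alerts is a real attack. Pushing accuracy higher mostly means driving down the false positive rate.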

Next, let's change gears a little bit and talk about some recommendations for vendor evaluation. There are a lot of AI/ML-based security products today, and you might have heard a lot of vendors bragging about their proprietary algorithms or sophisticated algorithms such as deep learning and neural networks. However, in practice, complex algorithms typically require exponentially more data to train and are often not the best algorithm for a given problem. From my experience, applying a single simple algorithm to the correct set of input data typically produces far better results than applying sophisticated algorithms to raw data. For a concrete example, let's look at database brute-force detection. In general, database error logs plus simple ML

algorithms will typically outperform the application of a sophisticated algorithm, like deep neural networks, to data that contains less signal, such as TCP packet lengths and sizes. Another thing practitioners should look out for is the quality and quantity of the input data being used by the modules. A lot of times, "we ingest everything" is a yellow flag, because effective application of AI/ML requires in-depth knowledge about the input data. It's actually very, very difficult to build effective AI/ML modules that will work on arbitrary data. In addition, another area practitioners should pay a lot of attention to when they're evaluating products is the false positive rate.

I've seen a lot of practitioners focus heavily on the false negative rate, or the ability to not miss real attacks, during PoCs. While it is extremely important for an AI/ML-based solution to not miss real attacks, noise, or the false positive rate, is frequently overlooked. In general, noise has a huge impact on the operationalization and the production efficacy of an AI/ML product, because even if the product is able to not miss any real attacks, by simply generating a lot of noise it could quickly cause the security team to lose confidence and stop using it.
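One way to keep both error rates in view during a PoC is to score each product on false negatives, false positives, and the analyst time its alerts would consume. The sketch below assumes labeled PoC results are available; the two products and all of their counts are hypothetical, and the 20 minutes per alert is the low end of the investigation time mentioned earlier in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PocResult:
    name: str
    true_pos: int   # real attacks that were alerted on
    false_neg: int  # real attacks that were missed
    false_pos: int  # benign activity that was alerted on
    true_neg: int   # benign activity correctly ignored

    @property
    def false_negative_rate(self):
        return self.false_neg / (self.true_pos + self.false_neg)

    @property
    def false_positive_rate(self):
        return self.false_pos / (self.false_pos + self.true_neg)

    def triage_hours(self, minutes_per_alert=20):
        """Analyst time needed to investigate every alert raised."""
        alerts = self.true_pos + self.false_pos
        return alerts * minutes_per_alert / 60


# Hypothetical PoC outcomes: A misses nothing but is noisy,
# B misses one attack but is far quieter.
product_a = PocResult("A", true_pos=10, false_neg=0, false_pos=500, true_neg=9_490)
product_b = PocResult("B", true_pos=9, false_neg=1, false_pos=20, true_neg=9_970)

for product in (product_a, product_b):
    print(
        product.name,
        f"FNR={product.false_negative_rate:.3f}",
        f"FPR={product.false_positive_rate:.3f}",
        f"triage_hours={product.triage_hours():.1f}",
    )
```

With these made-up counts, product A looks better on false negatives alone, but its alerts would take 170 analyst hours to work through versus under 10 for product B. Which trade-off is acceptable depends on the team, and that is why both rates belong in the evaluation.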

Another factor to consider when evaluating AI/ML products is scale. It's obvious that AI/ML solutions in general require a lot of compute resources, and mature ML solutions could involve dozens of algorithms and millions of models per second. There are typically two ways to deploy ML solutions. One is to run the ML algorithms and models on board some sort of on-premises or virtual appliance. The other is to offload features to the cloud and execute the machine learning models in the cloud. In general, onboard ML solutions are constrained by local compute resources and typically perform worse than cloud-based solutions. You might have concerns about the ability of cloud-based ML solutions to be effective

without access to sensitive PII data. However, at this point there are a number of techniques that can be used to assist ML algorithms and help them achieve a very high level of efficacy, even without accessing any sensitive PII data in plaintext. My final recommendation for AI/ML product evaluation is to test in realistic environments. Most AI/ML solutions are actually extremely sensitive to the environment, because, similar to building physical devices, building complex AI/ML solutions that can operate across a wide range of networks is actually very difficult. An analogy would be building an AI doctor that is not only able to detect diseases in humans but also all the way down to single

cell organisms. We have seen AI/ML solutions that work perfectly in small lab environments but quickly fall apart in realistic environments, for example with 10,000 devices. In addition, keep in mind that many vendors actually include some form of PoC mode in their AI/ML product to artificially increase the sensitivity for small lab PoC environments. If not clearly disclosed, these practices are somewhat ethically questionable and often lead to reduced efficacy post-purchase, because the PoC mode cannot be turned on in real-world deployments.

In conclusion, today AI/ML provides a practical and scalable way to automate the conversion of data or telemetry into actionable insights in cybersecurity. At the same time, we are still in the early days of applied AI/ML, and I would personally characterize our maturity today as a form of intelligence augmentation, not autonomous intelligence. In general, AI/ML has been overhyped and oversold by a lot of security vendors, but there are a few reputable vendors who have been making investments for years and have products that actually do what they claim. That's it for my talk. Thank you for your time.
