
Good afternoon, my name is Ed, and today we're going to talk a little bit about the current state of virtualizing network monitoring. This talk isn't as cool and sexy as brand-new vulnerabilities; it's a little more on the engineering side, about where we think the future of network monitoring is moving. We got the after-lunch snooze slot, so we'll try to keep it as lively as a couple of engineers can. So the first question is: why is everyone talking about this? Why are we even talking about virtualizing network monitoring?
The reality is that virtualization is here; it's not going anywhere, and it's only becoming more prevalent. There are two major reasons we're looking specifically at network monitoring in a virtualized environment. The first is that the data is moving into virtual environments, so we have to move the network monitoring to where the data is. The second is that the same benefits applications get from virtualization are benefits we in the security community can get as well: things like scalability and fast deployment. Right now you can go to Amazon or Rackspace and spin up an application in a matter of minutes, while in the network monitoring and security community, if we want to deploy a new IDS, it sometimes takes months. So we're trying to get better at deploying new tools and new techniques in that same fashion, leveraging the capabilities virtualization has brought to applications.

Why else is this important? It allows us to do right-sizing. In the world of network security we're talking about firewalls, IDSes, web content filters, full packet capture, NetFlow monitoring, all those capabilities, and right now a lot of those are physical hardware appliances. The problem with a hardware appliance is that bandwidth is growing really fast. As an organization grows, it needs more bandwidth: a small company might start with a 100-megabit pipe, then as it grows suddenly it's a one-gig pipe, then two or three gigs. Every time you have that kind of growth, scaling a Cisco or Palo Alto firewall often means rip and replace: we're upgrading to two gigs, but my firewall only has a one-gig port, so I've got to rip and replace, and again we're talking multiple months of deployment. Virtualization lets us right-size and scale out a little better, for the same reason applications moved to the cloud. Then there are the new NFV and SDN technologies; I'm going to try not to use the term SDN too much, because it's badly overused (everybody claims they do SDN right now, and it's never really been defined), but it is something applicable to us as network monitoring folks. The last thing is that we wanted to see whether open source was really ready to meet all the requirements, so we cobbled together a ton of different open-source tools, which Dan is going to talk about, into one big virtualized monitoring stack, just to see what it could do. With that I'm going to turn it over to Dan real quick, and he's going to cover some of the awesome stuff he's been working on. So, between us, we built two
prototypes. The first was an all-in-one solution: we wanted something that did the IDS monitoring and all the packet flow you would expect from an enterprise, correlated all the logs together, put them in a single central repository, and gave us a way to query and search the information. The second was to build as big an IDS/Bro box as we possibly could; there we were more concerned with how much bandwidth we could actually process in a single virtualized instance, and that's the second project we'll talk about. Both projects came in between three and four thousand dollars, which, depending on who you are, may or may not be cheap, but we wanted to show what was possible. All the software we used is freely available online, so there are no licensing costs; all our money went into hardware.

The first solution looks like this. It's very portable, a little bigger than a six-pack of beer (I even put a picture of a beer in there so you get an idea of its size), it weighs about 20 pounds, and it can be used as a portable solution. The specs: an eight-core Xeon D. If you're not familiar with them, Xeon Ds are relatively new, very low-power Intel chips; the one we went with draws about 45 watts. It has two 10-gig interfaces and two one-gig interfaces. We put 128 GB of RAM in this particular system, since it's going to be doing everything, plus 24 TB of raw storage in RAID 10. To keep up with the performance, we also added an 800 GB enterprise SSD cache drive. It doesn't add anything to the storage capacity; it speeds up the disk so the box can keep up. If you pull the cache drive out, the whole thing comes crashing down, so the cache drive really does do its job.

Some uses we thought of: operational field security traffic monitoring, where you have a specific host or subnet you want to watch more closely because you suspect something is going on, and you want to throw a box on a tap and start pulling information off. Possibly a small portable training environment; if you hardened it into a beefier unit that travels better, it would probably work very well for that. Or a site with very poor bandwidth, where you want to collect a bunch of logs but you know you're not going to ship them to a central office, so you store them locally and periodically go in and check on them. Finally, computer network defense for small businesses, or for large businesses with lots of small remote locations. In total we got this box to about 300 megabits per second, which
is not bad considering everything it's doing, and we'll show you all the capabilities of the box soon.

On the hardware architecture: we do have a separate firewall/VPN proxy as well as an IPS that are there for legacy reasons. I'm fairly confident it would be easy enough to fold them into the single virtual box, but they were already in place, so we never touched or upgraded them. We pull everything off a port mirror on a Dell switch. Because this is a virtual environment, each of the little red boxes inside the virtual server is a virtual machine, and we needed a way to copy all the traffic to all our IDSes. For that we used Open vSwitch, a module that goes into Linux and lets you treat all your virtual machines as if they were plugged into a switch, with a lot of extra capabilities that the traditional way of networking couldn't give you. We also have a bunch of extra systems in here: performance monitoring, single sign-on for managing accounts, and Puppet plus Foreman for patch management, configuration management, and a bunch of other things. So there's a lot of extra configuration just for managing this single box.
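As a rough sketch of how that Open vSwitch piece can work, a SPAN-style mirror that copies a bridge's traffic out to an IDS port can be created with `ovs-vsctl`; the bridge and port names here are illustrative placeholders, not the ones from this build:

```shell
# Copy every packet crossing bridge br0 out to port "ids0", where a
# virtual IDS is listening (bridge/port names are placeholders)
ovs-vsctl -- --id=@p get Port ids0 \
  -- --id=@m create Mirror name=ids-mirror select-all=true output-port=@p \
  -- set Bridge br0 mirrors=@m
```

With `select-all=true` the mirror sees everything on the bridge; `select-src-port`/`select-dst-port` can narrow it to specific VMs.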
The more interesting part is what our logging architecture looks like. We have a bunch of different log-gathering tools collecting from a whole bunch of sources. If you're not familiar with it, there's a tool called Packetbeat, made by Elastic; it's free and actually pretty neat, doing network flow monitoring straight into Elasticsearch. Suricata is an open-source IDS. Bro is monitoring and recording information about all the network packets and contextualizing them. We're collecting syslog from all the Linux systems, and we added Winlogbeat for the couple of Windows boxes we collect from. We also collect from pfSense and Snort, so we can see what's being dropped and what's not and put context around that. Finally, there's a tool called Moloch, which I'll talk about in a bit: it's a full packet capture system that indexes all the metadata into Elasticsearch and saves the raw pcaps so you can pull them out. You can say, "I want to see this HTTP request, show me the full packets." It's actually a pretty cool tool for being free; it's made by AOL (yes, they still exist, and they make a genuinely good open-source tool), and I highly recommend people check it out.

If you've ever set up Elasticsearch, the general recommendation is to pump all the logs into Redis first, which is an in-memory buffer. I skipped that at first because I thought this deployment wasn't big enough, but I found it was actually important even for this little single-node instance. Redis collects the logs and holds them in memory until Logstash can handle them. One of my big surprises was that Logstash was the biggest resource hog on the entire system. Logstash is what enriches the logs: it adds things like GeoIP information for every IP it sees, it takes anything that isn't JSON and turns it into JSON, and you can bolt on all sorts of behavior, like sending messages to Slack on certain events, which I'll show in a minute. It's a pretty open-ended pipeline language and works well. Elasticsearch is where everything is stored; it sits on a single VM with 32 GB of resources, which is pretty big. Kibana is the tool you use to view the data.

This is a picture of our Slack. One of the cool things about using Slack for notifications is that you can check it on your phone: I get Slack messages when people log in, when people connect, all sorts of stuff, and Logstash is what's actually generating those. With this setup we were able to do about 300 megabits, which was my final goal, and we could probably go a little higher. We were doing about 4,000 events per second across all the log sources, and during that time the system drew around 90 to 126 watts, which is not too much if you're worried about power and efficiency.
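A minimal sketch of what that Logstash enrichment stage can look like; the field names, the GeoIP source field, and the Slack output (which needs the community `logstash-output-slack` plugin, with a placeholder webhook URL) are illustrative assumptions, not this setup's actual config:

```
filter {
  # Turn a plain-text syslog line into structured JSON fields
  grok { match => { "message" => "%{SYSLOGLINE}" } }
  # Attach GeoIP data for the source address (field name is an assumption)
  geoip { source => "src_ip" }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  # Notify Slack on login events (placeholder webhook URL)
  if [event_type] == "login" {
    slack { url => "https://hooks.slack.com/services/T000/B000/XXXX" }
  }
}
```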
The three most taxing services were Logstash, the Moloch capture process, and Suricata. And here's a picture we thought was funny: right now this whole thing is sitting on my home network, and I kind of creeped my wife out when I showed her it was recording everything she was doing, so this is a picture of her making fun of me.

OK, so, I don't think Dan made it really clear: literally everything he just showed you is running on his home network right now. And again, the point of this wasn't "oh, it can do 300 megs"; the real point was to
see how cheap a box we could build while packing massive amounts of open-source capability into it, and we were actually astonished when we got 300 megabits out of it. That was incredible for a box pieced together on Amazon for about four grand. So then we started asking: how do we scale up? The customers we work with deal in the tens, twenties, and forties of gigabits per second; how do we actually bring this kind of capability to them?

The first thing I want to bring up is that in the new world of virtualization, containerization, and SDN, what we're really talking about as network monitoring folks in a security context is that we're not dealing with pipes anymore; we're dealing with individual flows. Once I understood that, it really started to emerge how powerful virtualizing could be. As an example, a lot of times when we think of perimeter defense we think of a single cable coming in: we chop that cable, put a bunch of boxes in between, and start monitoring. But that's not really how the traffic works. There might be 10,000 clients behind that cable, and each connection a client makes out is a flow. If a client makes an HTTP connection out, there are specific boxes we want to send it to; if it makes an HTTPS connection, there might be other boxes we want to send it to. That's the shift we want to talk about: we're not talking about big pipes anymore, we have to look at individual flows, and we get a lot of power out of that.

I'm not going to dwell on this too much. Basically, a lot of things are getting consolidated right now; the movement toward virtualization and data centers is about consolidation. Budgets, at least for our customers (we mostly work in the federal space), are getting compressed terribly. At the same time there's a wave of brand-new open-source software for this type of thing, because the application and services side of the market is demanding it, so tools like OPNFV, Open vSwitch, DPDK, and FD.io are now emerging that allow us, as security professionals, to do some really interesting things.

I'm going to move through some of this quickly. We're going to talk a little about multi-tenancy and why we focused on it. The background is that some of the projects we worked on for our customers were focused on consolidating network monitoring. Think of a bank with 50 branches: instead of putting an IDS at each branch, we put in one big IDS and route all the branches through a regional security stack, if you will.
There's a lot of background on how that works. One requirement in the projects we work on is that if the bank has two different regions, those regions should be able to manage their own policies without affecting each other, which can be a hard thing to do in virtualization. If you've got multiple IDSes, say Snort boxes, living on one physical server, and I push a bad rule to my Snort instance that bogs it down and eats all the resources, it's going to affect everybody else. Those are some of the challenges and problems we're trying to solve. There's also a higher-headquarters flavor of multi-tenancy, where you have two different sites plus one hierarchy that provides blanket policies; again, I'm not going to dwell on it.

All right, so when we wanted to scale this out into big bandwidths, we compared bare metal versus containers versus VMs. I know that in the security world containers and Docker are generally a dirty word, but we're looking at how to leverage them to our advantage, in the hope that they keep getting more and more secure. And for network security monitoring the threat model is different: you're not going to compromise a Snort box by sending traffic through it; you'd attack it from the side. So we're not overly worried about containers anyway. With virtual machines you've got bare metal, a kernel, a hypervisor, then virtual hardware, then another kernel, then libraries, then your app, the app in this case being something like Suricata, Snort, or Elasticsearch, all the stuff Dan was just describing. The problem is that network monitoring is packet processing, and when you're doing packet processing the kernel is your enemy, your biggest enemy, and a VM injects two kernels into the path. When we moved over and started looking at containers, it made a lot of sense, and when you compare the packet-processing performance of virtual machines and containers, containers win hands down. So that's what we concentrated on. There are also new technologies out now, and I'm not going to go into depth on SR-IOV, that are built specifically to bypass all of the kernel mechanisms, letting you push something like 10 gigabits per second on a single box directly into the application in user space; SR-IOV is one of those.
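As a rough illustration of the mechanism (the device name and VLAN IDs here are assumptions, not our exact config), SR-IOV virtual functions are created through sysfs, and the NIC can then be told to hardware-filter a VLAN to each VF:

```shell
# Carve 4 virtual functions out of the physical 10G port
echo 4 > /sys/class/net/enp3s0f0/device/sriov_numvfs
# Have the NIC itself steer VLAN 101 to VF 0 and VLAN 102 to VF 1,
# so each tenant's traffic lands directly in its own container or VM
ip link set enp3s0f0 vf 0 vlan 101
ip link set enp3s0f0 vf 1 vlan 102
```

Because the VLAN filtering happens in the card's firmware, the host kernel never touches the packets on their way to the monitoring application.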
We specifically used SR-IOV because it allowed us to pull in many VLANs and then parse those VLANs out to individual applications; it actually uses the network card's firmware to split the VLANs and deliver them to applications, so we were able to get really high throughput. That's literally what it's made for. For this proof of concept, the "high-speed" one, we used a setup similar to what Dan bought for his house: another Xeon D, a D-1541 (those processors are awesome), and the total cost was about twenty-three hundred dollars. I'm going to skip ahead since time is going, but long story short: the Intel X710 and XL710 cards are the only cards on the market right now that let you do 10 or 40 gigs in promiscuous mode, so keep that in mind. If you want to virtualize with promiscuous mode and SR-IOV, you can't do it without those cards. We ended up making a Bro container and a Suricata container; the slides will be available, and we put that stuff up on GitHub. It's really easy to download: a couple of commands and you've got a Suricata box up and running with Hyperscan.
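For a sense of what "a couple of commands" might mean, a per-tenant Suricata container pinned to one SR-IOV virtual function could look roughly like this; the image name, interface name, and config path are placeholders, not the actual repo contents:

```shell
# "suricata-hs" stands in for a locally built image with Hyperscan support;
# enp3s0f0v0 stands in for the SR-IOV VF assigned to this tenant's VLAN
docker run -d --name ids-tenant1 --net=host --cap-add=NET_ADMIN \
  suricata-hs suricata -c /etc/suricata/suricata.yaml -i enp3s0f0v0
```

Running one such container per tenant is what makes the per-region policy isolation possible: a bad rule only bogs down that tenant's instance.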
OK, so the results: what did we end up doing? We took that single box we just showed you and stuffed in as many containers as we could, spinning up a bunch of Suricata instances at the same time: one container, two containers, five containers, ten containers. Each Suricata had thirteen thousand signatures, and we were pushing roughly four gigabits through this little twenty-three-hundred-dollar box across 10 VLANs. We were trying to test the multi-tenancy story: does spinning up more containers actually affect the packet loss or the throughput? What we came back with, after some tuning, is that it really doesn't. If you do this right, you can spin up ten or more containers without affecting your throughput at all, which means we can take the stack Dan built and start scaling it out horizontally; that's the goal. And yes, again, SDN is a rotten term, but with it you can do some really powerful stuff, and that's really the future we want to bring up. As an example, take your HTTP traffic:
in the monitoring setup Dan was showing a minute ago, if traffic comes in as HTTP, there are certain boxes you want to send it through. If it's something like trusted services (if there is such a thing), you might only want it to hit certain boxes. If it's DNS, you only want it to go to the DNS analyzer. And if you know it's, say, Type 1 encrypted traffic, why are you putting it through your Bro or your IDS or IPS at all? It's fully encrypted; you're just wasting processing power. So when you're dealing with individual flows, you can send some flows in one direction and other flows in another, all on the same hardware, and that's really what we're looking at as far as the future goes. For example, traffic going to an internal server might still go to Bro, but a stripped-down Bro that doesn't run a whole lot of policies, only the specific ones you want; traffic with a destination of, I don't know, Russia or North Korea, you'll want to send to a different Bro. If you look at it as one pipe and send all your traffic to one Bro, you can't get that kind of granularity. But with NFV and virtual switches you can steer your traffic around internally based on any part of the n-tuple: the source, the destination, the port, the protocol. All of those metrics can define how your traffic passes, all within this virtual context. Likewise, if you've got a ton of traffic going to Google or Netflix and you don't want the Netflix traffic bogging down your IDS, you can steer it around.
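As a toy sketch of that per-flow steering decision (this is illustrative pseudo-policy, not the talk's actual code; the field names and analyzer names are assumptions):

```python
# Route each flow to a set of analyzers based on fields of its n-tuple.
def steer(flow):
    """Pick analyzer containers for one flow, given a dict with
    'proto', 'dst_port', and 'dst_country' keys (names are assumptions)."""
    analyzers = []
    if flow["proto"] == "udp" and flow["dst_port"] == 53:
        analyzers.append("dns-analyzer")             # DNS only needs the DNS box
    elif flow["dst_port"] == 80:
        analyzers.extend(["bro-http", "suricata"])   # full HTTP inspection
    elif flow["dst_port"] == 443:
        analyzers.append("suricata")                 # encrypted: skip HTTP analysis
    if flow.get("dst_country") in {"RU", "KP"}:
        analyzers.append("bro-full")                 # extra policies for risky destinations
    return analyzers
```

In the real system this decision lives in the virtual switch's flow rules rather than in application code, but the logic is the same: the n-tuple picks the path.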
OK, so where are we going with this? We're going to combine these two efforts and start scaling out horizontally. We're still learning the engineering challenges of virtualizing a lot of this network monitoring stack, but between the two proofs of concept we think we've got a lot of those problems licked. We're going to take what Dan has done on his home network and, working together, basically Dockerize all of it, now that we've seen the performance benefits we get from Docker. We have to look at which services can be made stateless; once they're stateless, you can basically click a button, spin up 20 of them at once, and they work right on the fly. Or you can scale out the hardware, have 20 individual boxes, spin up a Suricata on every one, and tear them down again, the vMotion style of moving workloads around. And then our real goal is to get this thing going at 20 gigabits per second. That's pretty much all we've got. Are we on time? All right, I don't think there's actually any time for questions, so we'll be around out back if you have any questions or want more.