
Shining a light into the security blackhole of IoT and OT

BSides KC · 2023 · 30:06 · 40 views · Published 2023-10
Category: Technical
Style: Talk
About this talk
The Internet of Things (IoT) and the rise of Operational Technology (OT) networks have significantly increased the number of connected devices in modern networks, creating new challenges in inventorying assets, identifying and mitigating vulnerabilities, and verifying security controls coverage. This presentation will explore the unique challenges that IoT and OT pose for network scanning and provide solutions for effectively addressing these challenges while ensuring the safety and availability of these systems. The presentation will cover topics such as identifying IoT and OT devices on a network, understanding the context of vulnerabilities associated with these devices, and implementing appropriate security controls to mitigate these risks while ensuring the safety and availability of these systems. Attendees will also learn about best practices and tools for IoT and OT network scanning, such as using automated asset inventory, performing regular vulnerability assessments, and testing the changes in a controlled environment before implementing them. This presentation aims to equip the audience with the knowledge and skills to protect their organizations' networks in the IoT and OT era while ensuring these systems' safety and availability.
Transcript

[Applause] Just one amendment here: the IoT part is for a longer talk. Since I'm doing a half-hour talk, I'm just going to focus on OT. I hope that's okay. If you want to talk more about IoT scanning, you can find me afterwards in the hallway. So yeah, like you mentioned, my name is Huxley Barbee. That is not a handle, that is my real name. I'm the only Huxley you're ever going to meet. Somebody did ask me earlier today if that was my handle. These are two organizations that I'm associated with, but more importantly for this talk, I have spent many years as a security

consultant with customers that have OT environments, so we're talking about higher ed, transportation, manufacturing. We're talking about robotic arms, devices that you would see in rail and transport stations, research devices, and so on and so forth. What I hope for you by the end of this talk is to know more about OT than you did previously, to give you a few pointers on how to do your own security research in OT, to understand the challenges that come with building out an asset inventory for your OT environment, and then finally to give you a few ideas on how to overcome those challenges. Does anybody know what percentage of chips are manufactured for

IT devices? Take a guess. Any number will work. Yeah, IT devices. 40%? 40%, okay. 90% of chips are developed for embedded devices; only 10% are for IT devices. This statistic boggles the mind, because what it really means is that the attack surface that's available to the adversary on the IoT and OT side is much, much larger, even though most of us focus a lot of our time on the left-hand side over here. Many of these are IoT devices, right? We like to joke about how lava lamps and coffee cups are on the internet now, but there are also rudimentary devices like home automation

printers, IP cameras, and so on and so forth. But even more important are these OT devices, operational technology devices that CISA considers to be part of critical infrastructure and key resources. We're talking about robotic arms, valves, things that work our dams, things that work our water treatment plants, and so on and so forth. I also include in here healthcare devices. Oftentimes this is called IoMT, the Internet of Medical Things, but because it is considered critical infrastructure and key resources, I bucket it under OT instead. One other thing to note is that I am using OT and ICS, industrial control systems, interchangeably, just for the purpose of this

talk. All right, so even though these OT environments are very important to all of our lives, they are shockingly unprotected. But when I say OT, what exactly do I mean? Because if you're like me, you grew up in an IT world, right? In your dorm you set up a laptop or a desktop or a tower and you were playing around with what you had. Most of us don't get a chance to have an OT device, a PLC, in our dorm, so it's a little bit foreign to many of us. So let's dig a little bit deeper into OT so we can improve our

understanding. All right, first, one humongous disclaimer: this is one example, and out in the world you're going to see other examples that diverge quite a bit from this. On the IT side you've got a PC or you've got a Mac, and you can use those devices for a variety of purposes: financial modeling, doing homework, streaming videos, playing games, and so on and so forth. On the OT side, devices are generally built for one single purpose, and for that reason there's way, way more variety. So just keep in mind this is one example; out in the field you're going to see something that diverges from this quite a bit. So what we have here is a

water treatment tank. There is dirty water that comes up from the left pipe, and then the cleaned-up water goes down the right pipe. What we have here are these two sensors. When the water level is lower than the lower sensor, the dirty water gets pumped in: this valve opens up and this pump pumps in the water. When the water reaches higher than the higher sensor, that closes up and it stops pumping. Then after an hour of treatment this valve opens up and then it drains out. The pump and the valves are known as actuators. And again, out in the field

things might look different, because in some cases actuators and sensors are integrated. Lots of variety out there. The thing that controls this entire operation is known as a PLC; it's the brains of the operation. Frequently this is a thing that has an IP address. And again, out in the world, if you are at a utility plant or electrical plant, the PLC might be called an IED instead, an intelligent electronic device. This over here is an HMI, and this is an interface that a technician uses to control the behavior of the PLC, or through the PLC. Think of the thermostat in your house: not a full computer, but a locked-down interface that's used by a technician. If you do want to rewrite how

the PLC behaves, then what you have is an engineering workstation. This actually is an IT device, and typically this is going to be some old version of Windows, such as Windows XP. I heard from someone, not myself, that they actually found an engineering workstation that was running Windows 3.1. I also heard this other great story. There's this VP at some $6 billion company, and he knew how important those engineering workstations would be to his operations. He found somebody with a Windows XP laptop, he paid the guy with a six-pack of beer, and ever since he got it he's left that XP laptop in the

drawer in his desk, and he's kept it there for years just in case the OT environment that he manages ever loses that particular laptop. I thought this guy was great, because he's basically funding the implementation of his disaster recovery through beer, which I don't ever get to do. Okay, so this is what is known as a distributed control system. Sorry, this is one OT system, and at a site you might find multiple of these that are all coordinating with each other in what is known as a DCS, a distributed control system. But in

some cases these OT systems are going to be spread over a large geographic area, in what is known as SCADA, supervisory control and data acquisition. In those cases you're also going to find an RTU, which relays from the PLC back to some sort of control center, a mission-control type of place. So this is a quick tour of what an OT system looks like, what an OT environment looks like. Just keep in mind that out in the field it might look a little bit different. Next, let's take a look at how securing OT environments is going to be different than securing IT environments. In IT we care about restricting

access to data, moving data, making sure that it's encrypted while it's moving, and so on and so forth. With OT you care about moving stuff: widgets, gears, and things like this. How many people here have a phone that's older than five years old? Okay, very few. Our IT devices, laptops, phones, are manufactured with planned obsolescence. On the OT side you're looking at devices that have been in commission, operating, for 20 or 30 years. I even heard from one particular organization where they had a time horizon of 50 years, meaning many of these devices have been operating or will continue to operate, and they're older than many of us in

this room. If you've been around for a little while, you've heard of the CIA triad, or AIC triad, whatever you call it. On the IT side we care about all those things, but on the OT side there isn't really a triad; it's all about availability. These organizations will do everything possible to avoid an outage, and I want to dig into this a little bit deeper because this is going to be important later. Imagine this is a commercial organization, let's say oil and gas. Every second that they're not moving oil and gas is financial loss. But it goes beyond that, because many of these

OT environments are part of critical infrastructure and key resources, so they are also highly regulated. Colonial Pipeline once had an outage, and then PHMSA came in and fined them a million dollars on top of their financial loss, so there's that component of it going on. If this is not a commercial organization but a governmental or quasi-governmental organization, you could imagine that perhaps there is some politician out there who wants to avoid the bad press of the municipal water treatment plant going down, or something like that. For these reasons it's plain that availability is paramount. This is absolutely the most important thing in any OT

environment. On the IT side we have these time-sharing operating systems that we all know. On the OT side you have real-time operating systems, and there are far, far more of them. Of course this has ramifications for security. If I were an EDR vendor, what would I do? Well, I'd write one version of my EDR for Linux, another version for Mac, another version for Windows, another for BSD. But I can't really do that on the OT side. Which of these 65 operating systems am I going to write my EDR for? What is my financial model to get a return on investment? Do I take the effort to do the top 10, the top

20, the top 30? It's a very different story in terms of securing OT because of the variety of operating systems. On the IT side, I'm sure you've heard of all these programming languages. On the OT side they have completely different programming languages, and this is sort of like an IDE for OT programming. The ramification here is that many of the innovations on the IT side in terms of the secure development life cycle, how we release, how we engineer code, how we QA code, don't really translate to the OT side. I'll talk about this a little bit more in a bit. On the IT side we're all familiar

with Microsoft's Patch Tuesday. On the OT side it's like Patch September, or like patch never. Why? Because these organizations want to avoid an outage. They do not want to take any sort of outage for any sort of updates or security patches, and they absolutely want to avoid any sort of extended outage that might come from a bad update. All right, insecure by design: to this day you still find a lot of OT devices that do not require authentication and that do not encrypt their traffic. All right, security controls and governance. There are a lot of security controls that are available on the IT side, unlike 20 or 30 years ago, when most devices didn't even

have antivirus on them. On the OT side that's still the case; it's almost as if you get taken back in time, like Back to the Future, 30 years. There are no security controls in many cases, and there's also no security governance. You will find very often that a lot of these devices have default usernames, default passwords, default ports that they listen on, default configurations. All right, and this one is going to be key. IT devices are connected to networks these days, wireless, wired, or what have you, indirectly connected to the internet somehow, or directly. In the past, OT environments were always air-gapped. If you wanted to compromise the device, you had to walk up to it and do

something to it. Not foolproof, of course: Stuxnet was done through a USB drive. But for the most part people thought, we're safe; all these other things up here that make this environment insecure, it's fine, because you have to walk up to it. But here's the thing: around 2005 or so, things changed. These environments started getting connected to networks. Why would they do that? Well, again, the business wins. There are operational efficiencies that come from connecting OT environments to the rest of the network. Imagine I have a valve that's out in the middle of nowhere. Do I want to fly somebody out

there, out into the middle of nowhere, just to turn a valve? Or would it be better for me if I could just, back at headquarters, push a button to operate that valve? Clearly there are operational efficiencies in connecting OT environments to the network: you save money, you save time, you save people flying around, and so on and so forth. But the thing is, this whole security through isolation came down. That curtain of air-gappedness came down as one of the ramifications of this operational efficiency. So starting around 2005 or so, and it's been a continuing trend, you now have this situation where all of these other things that made these OT

environments insecure have been laid bare to the adversary over the internet. Is anybody scared yet? No? Okay. All right, so those of you who are a little bit familiar with OT might say, but Huxley, there's this Purdue model; let's do the Purdue model and everything's fine. So this is the Purdue model. The idea here is that you stratify your risk by dividing up your various devices into these layers. You can see here layer zero, those are the sensors and actuators. Layer one, these are your PLCs over here; this is where IPs start showing up. Two and three, you've got your HMIs, and

then all the way at the top you have some other stuff up there. Oh, one other thing with Purdue: the idea is that each layer can only communicate with an adjacent layer. If you're in layer one you can only communicate with two or zero; you should not be able to jump. The other part of it is that between layers you should have some sort of control that adjudicates communication across that boundary. This could be a physical control, like being air-gapped, or some sort of network control, an IPS, a firewall, and so on and so forth. All right, so two things here. Yes, there's stratification of risk

here, but here's the thing. You'll notice that on layer five it looks very IT-ish; we start seeing a lot of IT devices, whereas down at the bottom we see more OT devices. And as we talked about, OT devices are far less secure than IT devices. What this means is that if you're able to infiltrate at the top, it's a foregone conclusion that you're going to make it down, because each layer you come down gets easier and easier. All right, the other thing here is that this is oftentimes a myth. Very, very few organizations actually fully express Purdue in the way it's supposed to be. In fact, here's an

example. Remember, PLCs are supposed to be in layer one. Supposedly you have to go from five, four, three, two, one to get to a PLC. But guess what: through Shodan or Google you can easily find PLCs that are directly connected to the internet. And remember what I said about security governance, or lack thereof: a default username or password will probably get you in there, default usernames and passwords that you can find on GitHub. Oh, and if they for some reason decided to change the username and password, remember I said these devices are often never patched, so you can go to CISA's website, look up the KEV for that device, and you probably can use it to exploit that

device. You don't have to do anything hard or different, because there are Metasploit modules to do that. All right, so some of you might have an unfinished bunker that you're building in your house. Yes, you should absolutely go take a look at that, or if you have a plan for going off the grid, I wouldn't blame you. I was being a little facetious earlier by saying there are financial reasons why all these organizations care about avoiding outages, but that in no way discounts the importance of the availability of these environments. These are the environments that manufacture our

pharmaceuticals, make sure that we have electricity in our houses, clean water that we can drink, and so on and so forth. So for those of us who are engaged in the defense of OT environments, what do we do? Well, one of the first things you want to do, of course, CIS Control number one, is to figure out what you have, figure out what it is that you need to go defend and protect. Historically, when organizations have attempted to do active scanning of OT environments, things have gone bad: outages, financial loss, and so on and so forth, many of them bordering on catastrophic. But we don't hear about it

because typically the organization will say, oh, it was a mechanical failure, or something like that. I heard from somebody that the 2003 outage on the Eastern Seaboard was because of an active scan, but of course it's never been confirmed, so don't quote me on that. For that reason many organizations rely on passive discovery to inventory their OT environments. So let's take a look at that. What does that mean? How does that work? Well, if you want to set up a SPAN port from a single switch out to this collector of network traffic for passive discovery, that's very easy: three lines of IOS and you're good to go. And typically when you do a POC, you

know, this is what you do. But how many of you work at a place that has a single switch? Anybody? Just one switch? No? Okay. Typically what's going to happen is you're going to have multiple interconnected switches, and then what do you do? Well, you could have all these switches go directly to the collector; that's problematic. Typically what you're going to end up doing is using one of these protocols. Of course, you're going to hope and pray that all these switches support that protocol and that version of that protocol, and that they're all the same. Another option is to deploy more collectors. You could do

that too, but that's sort of trading one problem for another problem. And what if you're actually working with a SCADA system? Now you have many, many interconnected switches that are geographically dispersed. I submit to you that it is damn near impossible to get a full asset inventory using this model, because you can't get to enough choke points on the network to make sure that you're getting everything. Many, many organizations, with all the best of intentions, after spending months and even years, still do not have a comprehensive or accurate asset inventory. The deployment is very complex, as I just illustrated. The performance is poor unless you have invested in these really beefy hardware appliances to

collect all the traffic. And for all that cost and all that effort, what do you get? You get an inventory that's missing devices. And because you're limited to the network traffic that's going across the wire, and you don't actually interrogate those devices, your fingerprinting is also going to be inaccurate, or at least vague. But organizations are willing to put up with all these things just to avoid outages. So here's the novel part of this talk: why don't we take a look at active scanning again and understand why it has failed in the past? That actually becomes the five principles of active scanning in OT. The first principle is to always send standard packets and

expected payloads. Take a look at packet 2053 here. Notice how the FIN bit, the push bit, and the urgent bit are on. This is not something that any application would send in the normal course of network communication. This is the type of packet that you would get from a legacy scanner, something that uses cfp for fingerprinting, or even Nmap. They purposely send this type of traffic to see what kind of response comes back from that device, and based on the response, fingerprint that device. This works really, really well for IT devices, or any device that might have been built in the last five years. With OT devices, on the other

hand, they will crash, or they will reboot, or they will freeze up. Why is this? Well, remember what I said earlier: all this innovation in software engineering, release engineering, and quality assurance that has happened on the IT side in the last 30 years has not translated over. The network stacks on these devices are not very popular, because there are so many different types of them. They're archaic, they're not popular, so they're not well tested. And testing of an OT device is more like: if I push this button, does it do that? It does? Okay, then we're done. Nobody's out there doing QA on these devices to make sure that they can handle arbitrary traffic. That's true of the network stack

on these devices, but also of the applications themselves: they're not tested for robustness, and so they are prone to disruption when they see something like this. Number two: security probes. Security probes, very much like in the last principle, are in and of themselves unexpected traffic. These devices are not built to handle a vuln check, and so when they see one they will be disrupted: they will freeze, they will crash, they will reboot, or otherwise behave erratically. All right, for the third one I've got to give you a little bit of background. We have this mission control back over here, and out in the

middle of nowhere is this pipe. This is so far out in the middle of nowhere that you're not going to get FiOS, you're not going to get DSL, but they have a phone line, so they actually set up an old-school modem, a 56k or whatever. So it's a very slow link. And there are some cases, for some of these organizations, where they have a particular site that's even further out, where you can't even get a phone line, so they use other means to get to the closest site and then relay off of the other really slow link. Generally speaking, some of these OT devices are very low-powered, so they

can't handle a massive amount of traffic all at once; they'll just crash. Remember, it's a real-time operating system, not a time-sharing operating system, so with too much traffic bad things can happen. But what we have here is a situation where it's not just the device itself, not just the endpoint, that can't handle a lot of traffic; the network itself can't handle a lot of traffic. This is very common with OT environments. So what you need is the ability to do two things. One, you need to be able to tune the number of packets per second you're going to send. Number two, you want to

be able to distribute that scan traffic across all the endpoints you're trying to inventory. The idea here is that rather than sending all this traffic to him all at once, I'm going to send you one packet, send you one packet, send you one packet, and then I'll come back to it. In so doing you're getting maximum coverage without overwhelming any single endpoint, but at the same time you're keeping your overall scan times to a minimum. So that's number three. Number four, and this one is extremely important. With OT devices, many times you will find some where, even when you send standards-compliant traffic, going back to principle number one, even when you send

standards-compliant traffic, the damn thing still crashes. It's just poorly written; the software on there is just not good, it's not robust. It really is meant to respond to somebody flipping a switch or pushing a button. In this case the strategy is to say, okay, we're not going to do the cfp thing, which is to send all these different queries to that one device in order to fingerprint it all at once. Instead, we're going to send that device one packet, a super benign query, just to, metaphorically, get the shape of

that device. Then when you get that response you say, okay, I know this is this, and therefore I can use this code path of queries and I'm going to avoid that code path of queries, iteratively sending successive queries to that particular device in order to fill in the shape that you fingerprinted earlier. Not 100% foolproof, but it's worked out very well in the past. So: sending iterative queries to a device in order to fill in the fingerprint over the scan period. And the last principle here is probably the most important, but probably the least well implemented, because it's got

nothing to do with code; it's all about people. Test and scan over time, also known as don't be stupid. You want to identify your OT sites, see if there's commonality, or see if, just visually, you can find any sort of overall categorization of those devices, and then attack them: build that inventory slowly and small, and then build it out over time. So those are the five principles of active scanning in OT, which, as I mentioned before, is a minority opinion. It's not something that people do very often, but it can work, and it

has worked. I'll leave you with this parting thought. There was a time when you followed all of these rules, but these days you get an Uber, those of you with Teslas might have tried the self-driving, I think most of you have probably bought Bitcoin or Ethereum at some point, and you've all worked from home at this point. So I challenge you: maybe it's time to start thinking about active scanning for your OT environments as well. All right, folks, that's it for me. Connect with me in all these ways if you like, and I will be out in the hall over there if anybody has any questions.
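
The third principle from the talk (cap the packets per second, and spread probes across endpoints rather than finishing one device at a time) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular scanner works: `interleave_probes` and `send_offsets` are hypothetical names, the addresses are documentation-range placeholders, and nothing is actually sent on the network.

```python
def interleave_probes(targets, probes_per_target):
    """Order probes round-robin: every target gets probe 0 before any gets probe 1."""
    return [(target, probe)
            for probe in range(probes_per_target)
            for target in targets]


def send_offsets(schedule, packets_per_second):
    """Pair each scheduled probe with a send time in seconds under a global rate cap."""
    interval = 1.0 / packets_per_second
    return [(n * interval, target, probe)
            for n, (target, probe) in enumerate(schedule)]


# Hypothetical targets (RFC 5737 documentation addresses), 2 probes each,
# with the whole scan capped at 2 packets per second.
targets = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
schedule = interleave_probes(targets, probes_per_target=2)
plan = send_offsets(schedule, packets_per_second=2)

for offset, target, probe in plan:
    print(f"t+{offset:.1f}s  {target}  probe {probe}")
```

With three targets at two packets per second, each endpoint sees one packet every 1.5 seconds, while the scan as a whole still proceeds at the full configured rate.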

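
The fourth principle from the talk (one benign query first, then pick follow-up queries based on what came back) amounts to walking a decision table one step at a time. The sketch below is illustrative only: `fingerprint`, the query names, the decision table, and the fake device are all hypothetical, and `send_query` stands in for whatever transport a real scanner would use.

```python
def fingerprint(send_query, first_query, follow_ups):
    """Fingerprint one device by sending a single query at a time.

    send_query(name) returns a response string, or None if nothing came back.
    follow_ups maps (query, response) to the next query considered safe.
    Any pair missing from the table ends the walk, so a device we do not
    recognize receives no further traffic.
    """
    facts = []
    query = first_query
    while query is not None:
        response = send_query(query)
        if response is None:  # no answer: stop rather than keep poking
            break
        facts.append((query, response))
        query = follow_ups.get((query, response))
    return facts


# Hypothetical decision table: which follow-up is safe after which answer.
FOLLOW_UPS = {
    ("banner", "modbus-gateway"): "modbus-device-id",
    ("banner", "web-panel"): "http-head",
}

sent = []  # records every query the fake device actually received


def fake_device(query):
    """Stand-in for a real device; answers only the two queries it knows."""
    sent.append(query)
    answers = {"banner": "modbus-gateway", "modbus-device-id": "vendor-x-plc"}
    return answers.get(query)


facts = fingerprint(fake_device, "banner", FOLLOW_UPS)
print(facts)
print(sent)
```

The fake device is never sent the `http-head` query, because its first answer steered the scan down the other branch. That is the point of the approach: the code path a device cannot handle is simply never exercised.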