SBOM Challenges and How to Fix Them

Name: SBOM Challenges and How to Fix Them
Uploaded: 2022-09-04
Duration: 43 min 49 s

BSides Las Vegas · 2022 · 43:49 · 135 views · Published 2022-09

Speakers

Hossein Siadati, Trupti Shiralkar

Tags

Category: Technical

Topic: Supply Chain Security, Vulnerability Research

Style: Talk

About this talk

Hossein Siadati and Trupti Shiralkar examine the practical challenges of generating and using software bill of materials (SBOMs) in securing open source supply chains. They explore tooling inconsistencies, unsupported build systems, and the gap between SBOM generation and actionable vulnerability management, then propose strategies for building organizational software upgrade policies and developer security education.

Original YouTube description

CG - SBOM challenges and how to fix them! - Hossein Siadati, Trupti Shiralkar Common Ground @ 18:00 - 18:55 BSidesLV 2022 - Lucky 13 - 08/09/2022

Transcript [en]

Good evening everyone, welcome to BSides Las Vegas, Common Ground. This talk is "Software Bill of Materials: Challenges and How to Fix Them". A few announcements before we begin. We'd like to thank our sponsors, especially our diamond sponsors LastPass and Palo Alto Networks, and our gold sponsors Amazon, Invisium, PlexTrac, Intel, Google, and BlueCat. It's their support, along with our other sponsors, donors, and volunteers, that makes this event possible. These talks are being streamed live, and as a courtesy to our speakers and audience we ask that you check that your cell phones are set to silent. If you have a question, you can use

this microphone up here; just make sure they know, if they're generous enough to let you ask. That way people can hear you on the stream and in the recording on YouTube. As a reminder, the BSides photo policy prohibits taking pictures without the explicit permission of the people in the frame. These talks are all being recorded and will be available on YouTube in the future. We would like you to keep your masks on at all times, if you would. It looks like there's enough room for everyone, so let's get started.

All right, thank you everyone, good evening. I can't believe all of you showed up for a 6 p.m. talk,

that's quite some commitment. We will try to answer your questions as much as possible; our information will be displayed here, and Hossein and I will be available after the talk as well if you have any questions. With that, let's get started. Today we will be talking about what a software bill of materials is, what kind of problems an SBOM can solve, what challenges we have faced while generating SBOMs, and how to fix them. As all of you know, open source software is eating the world. Obviously there are a lot of advantages, lower cost as well as speed of execution, when we use open source software to design, build, and ship new

services or tools. Now, considering the statistics we have here, if there is a critical vulnerability you can imagine the gravity and the impact at scale. Let's take a look at a couple of them. Raise your hand if you remember 2017's Equifax breach. Oh wow, almost 60 percent. In 2017 an open source component, Apache Struts, suffered a file upload vulnerability through which remote code execution was possible. As a result, during this breach almost 147 million users' personal data was exposed in the US, and according to the FTC the settlement was around 425 million. Now let's take a look at our next example, Heartbleed. Raise your hand if

you remember this one. This is a little old, almost 10 years old. Okay, approximately 50 percent. In 2012, on New Year's Eve, a German developer introduced a buffer overflow vulnerability in the Heartbeat extension of OpenSSL. It was discovered almost two years later, in April 2014, by Google researchers, and as we all know almost 60 percent of sites and services were affected by it. It took all of us quite a few weeks and months to clean up the mess. Who remembers Log4j, the most recent? Awesome, almost 75 to 80 percent. In the Log4j vulnerability, which allowed remote code execution, millions and millions of Java-based applications, data stores, and devices were vulnerable, right?

Now, those were the prime three examples, but look at this CVE data. In 2011, this many vulnerabilities with CVSS scores of 9 and 10 were produced: 832. We are in 2022, and take a look at the number. Wow. Now, in this scary situation, can SBOMs really save us and drive remediation at scale? That's the question we are going to answer in today's presentation. Hi, this is Trupti. I am an engineering manager for software security at Datadog. I'm a mobile game developer turned security professional, so as a developer I can truly relate to all the pain points a typical developer faces with respect to security. I'm always open for mentoring,

coaching, or an interesting security and privacy conversation over virtual or in-person coffee. When I'm not doing security, I like to exercise, and that's me upside down doing aerial yoga. I hike, and I also like to conduct meditation workshops because I believe in work-life balance; I'm a certified meditation instructor as well. This is my contact information; feel free to add me on LinkedIn or send me an email with your questions or inquiries. With this, I would like to introduce my esteemed colleague Hossein.

Hello everybody, this is Hossein Siadati. I'm a senior security engineer at Datadog. I have a PhD in computer science from NYU. I am a Xoogler; I worked on software supply chain security at Google

as well. When I'm not doing security, I do hiking and swimming, and I've started surfing. This is Photoshop, obviously, but I aspire to be a good surfer.

Thank you, Hossein. Today's agenda: we are going to talk about open source software security gaps, and then Hossein will introduce the concept of SBOM. I'm sure many of you are already familiar with it, but he will introduce it in a very creative way for those for whom it's a new concept. Then he's going to talk about some of the tooling and SBOM use cases beyond improving open source software security, and then he will focus on SBOM challenges and some of the solution approaches we can use to

fix them. I will also speak about some of the strategic initiatives we, as security professionals, can take in our respective organizations to improve the state of open source software security. Now let's talk about the common open source software security gaps we see and why we see them. The number one gap I see is that open source software developers don't necessarily have security education. At university, if they opted for university education at all, they either take one security class or none. That knowledge is not sufficient to prevent them from introducing security flaws in open source code. The second gap I have seen most commonly is

that for the last 10 to 15 years we have relied on software composition analysis tools, and not necessarily SBOM tools, to fix our vulnerabilities, and when we relied purely on an SCA tool, the reported vulnerabilities mostly lacked detailed information on exploitability and so on. The third gap I have seen is that almost 50 percent of organizations do not have an open source software security policy or standard rolled out. As a result, every time there is a severe vulnerability like Log4j or Heartbleed, everybody loses sleep and starts hunting: what's the blast radius, what's the impact, let's upgrade if we are affected. But we do not necessarily have a policy that can educate our

developers or create a culture of automatic software updates regardless of vulnerability, and we are going to talk about how SBOMs can help achieve that state as well. Lastly, as a result of lack of education, lack of adequate tooling, and lack of standards and policy, we see immature processes for upgrading OSS. Many times, if OSS upgrades are already integrated in the repository, there is a chance that they can break the service and cause regressions. To avoid all these problems, our main motivation is to improve the state of open source software security, and to do that it is extremely important to understand how we can leverage a software bill of materials and what the use cases are.

Traditionally we are quite familiar with SBOMs generated from source code, but in today's talk Hossein will put emphasis on how we can generate SBOMs from different sources, such as source code, build time, and runtime, and what unique advantages that can offer us to foster open source software security. Lastly, we would like to discuss some strategic initiatives. With that, I would like to hand it over to Hossein.

Thank you, Trupti. A warning before I go to the middle section of the presentation, and that would be SBOM fatigue. You have heard a lot of SBOM, SBOM in the industry, but bear with me; hopefully when we link SBOM concepts

together we can get something out of it. So what is an SBOM? SBOM stands for software bill of materials. I'm very happy that Allan Friedman is in this meeting; he's the SBOM guy who drives a lot of initiatives around SBOM, and next week they're going to have a big group of people getting together to talk about the next steps for SBOM and where we can take it. Thank you so much for all the great work that you have done in the domain. This definition comes from the various documents that NTIA has provided around SBOM, and the definition goes: an SBOM is a nested inventory for software,

a list of ingredients that make up software components. If I want to draw an analogy, in the domain of mechanical engineering this is not a new concept. For something like 70 years, since around 1960, industrial engineering and mechanical engineering have had these sorts of diagrams that specify what components are being used in an engine, for example: what the shape is, what the lengths are, and other aspects of it. If there is any problem in an engine, they can go back and see which part it was and who the producer was, to be able to fix the problem,

identify and fix the problem. The same goes for the food industry and chemical engineering. For example, there are customer-facing labels on almost anything that we use that say how many calories it has and what the most important ingredients are that customers specifically want to know about, to satisfy some use cases: for example, if somebody has an allergy, they should know if there is something in it that they are allergic to. From that definition, it sounds as if an SBOM is only the list of dependencies, or nested dependencies, but I want to put some emphasis

here that an SBOM is not only the list of dependencies; it's dependencies plus some context. In addition to the list of dependencies, the suggested baseline information would be author name, supplier name, component name, version string, component hash, a unique identifier, and the relationship of an object to other objects. In most cases that would be an "includes" relationship, where one component includes another, but the relationship could be something else. You can also add other contextual information around dependencies, such as licensing, timestamp, end of life, or grouping; you can add as much context as you want, and this information is very powerful

for satisfying the use cases that I will talk about. There are many different use cases that we can imagine around SBOM, so it's not only "give me the list of dependencies"; it's going to serve the company with different use cases. There are tons of them; this is based on research that we have done within the company, but I'm going to emphasize only two. One of them is vulnerability management, for discovering vulnerabilities. For example, if you know that you have a dependency on one specific open source project, and that dependency is impacted by a recent CVE, then you know that you are probably impacted by that,

but not necessarily. The other one, which is becoming more and more important, is software supply chain security. As I will describe, an SBOM is not only one point of view of your software; it can also show the chain and the workflow of your software. To give you a bit more context on how SBOMs surface from the technical point of view: when you describe a piece of software, you're going to have some form of description of it. It could be free form, and you could choose how you want to present the data, but fortunately there are two major

standards. One of them is SPDX and the other is CycloneDX. SPDX was created in 2010; the more recent one, CycloneDX, dates from 2017. The major use case of SPDX was around compliance, showing what components I'm providing in this software. Specifically, if a piece of software was used by an external, third-party company, you had to provide the list of ingredients of your software. CycloneDX is aimed at more recent use cases. This is a specific example of a piece of software with its dependencies at different levels, and these are the SPDX and CycloneDX descriptions of it.
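As a rough illustration of the baseline fields listed in the talk (supplier, component name, version, hash, unique identifier, relationships), here is a minimal sketch that builds a small document loosely modeled on the CycloneDX JSON layout. The package names, suppliers, and hashes are made-up placeholders, and the output has not been validated against the CycloneDX schema.

```python
import json


def component(name, version, supplier, sha256, kind="library"):
    """Build one component entry; the Go-style purl is an illustrative choice."""
    purl = f"pkg:golang/{name}@{version}"
    return {
        "type": kind,
        "bom-ref": purl,                      # unique identifier within the document
        "supplier": {"name": supplier},
        "name": name,
        "version": version,
        "hashes": [{"alg": "SHA-256", "content": sha256}],
        "purl": purl,
    }


# Hypothetical application with one direct and one transitive dependency.
app = component("example.com/demo-app", "1.0.0", "Example Corp", "0" * 64, kind="application")
lib_a = component("example.com/lib-a", "2.3.1", "Lib A Authors", "1" * 64)
lib_b = component("example.com/lib-b", "0.9.0", "Lib B Authors", "2" * 64)

sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.4",
    "version": 1,
    "metadata": {"component": app},
    "components": [lib_a, lib_b],
    # Relationships: the app depends on lib-a, which depends on lib-b (nested inventory).
    "dependencies": [
        {"ref": app["bom-ref"], "dependsOn": [lib_a["bom-ref"]]},
        {"ref": lib_a["bom-ref"], "dependsOn": [lib_b["bom-ref"]]},
    ],
}

print(json.dumps(sbom, indent=2))
```

The `dependencies` section is what makes the inventory nested rather than a flat list; extra context such as licenses or end-of-life dates would be added per component.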

But as I will describe, we shouldn't be worried about the formats, because they are interchangeable and there are tools to convert from one format to the other. As Trupti mentioned at the beginning of the talk, you can create the SBOM in different stages of software creation, basically the software development lifecycle. It could come from source code; this is the most common case. There are tools, including Trivy and others, that take the source code when you run a command. I have linked a big document, which you will see when you have access to the slides, listing the tooling that you can use, mostly on source code.
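To make the point about interchangeable formats concrete, here is a simplified sketch that maps CycloneDX-style components to SPDX-style package entries. It is not one of the converters mentioned in the talk; the field names follow a reading of the CycloneDX and SPDX 2.x JSON layouts and should be checked against the specifications, and the sample component is hypothetical.

```python
def to_spdx_packages(cdx_bom):
    """Map CycloneDX-style components to SPDX-style package entries (simplified)."""
    packages = []
    for index, comp in enumerate(cdx_bom.get("components", []), start=1):
        package = {
            "SPDXID": f"SPDXRef-Package-{index}",
            "name": comp["name"],
            "versionInfo": comp.get("version", ""),
            # CycloneDX writes "SHA-256"; SPDX writes "SHA256".
            "checksums": [
                {"algorithm": h["alg"].replace("-", ""), "checksumValue": h["content"]}
                for h in comp.get("hashes", [])
            ],
        }
        if "purl" in comp:
            package["externalRefs"] = [{
                "referenceCategory": "PACKAGE-MANAGER",
                "referenceType": "purl",
                "referenceLocator": comp["purl"],
            }]
        packages.append(package)
    return packages


# Hypothetical input with a single component.
sample = {
    "bomFormat": "CycloneDX",
    "components": [{
        "name": "example.com/lib-a",
        "version": "2.3.1",
        "purl": "pkg:golang/example.com/lib-a@2.3.1",
        "hashes": [{"alg": "SHA-256", "content": "1" * 64}],
    }],
}

spdx_packages = to_spdx_packages(sample)
print(spdx_packages[0]["SPDXID"], spdx_packages[0]["versionInfo"])
```

A real conversion also has to carry relationships, licenses, and document metadata, which is why an existing converter is the better choice in practice.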

There are also integrations with CI/CD, for example: you can add GitHub Actions to your repository so that as you push new source code, these SBOMs get generated automatically. You can push them, reuse them, and regenerate them. And there are build-time tools, for example Microsoft's SBOM generation tool, that give you the build-time software bill of materials, which is something different. In addition to the concept of context that I mentioned, not only does the actual software that you have have dependencies, but the build tool that you are using has dependencies too. For example, if that build tool is also

impacted by a specific vulnerability, it might influence the final artifact that you generate. All of these are related and should be considered. As for runtime dependencies, at the moment most of the tools that provide application performance monitoring have visibility into the components that your software is using, and they will be able to generate a sort of software bill of materials for you. But as I said, there is a lot of hype in the industry around SBOM, and people truly and genuinely started generating SBOMs from whatever software they are using, but there are challenges. The first challenge is around tooling. If you run two different tools on one repository, for example, they're

going to give you different results, with different numbers of dependencies. For example, I ran two tools, Trivy and CycloneDX, on gotof, and they report different numbers of dependencies. Of course, part of it could be because some of these tools include test dependencies or whatnot, but even if you exclude them you still see differences in the number of dependencies that they report. This is one of the main challenges, and one aspect of it is that some of the tooling is noisy; it generates too many dependencies that you are not actually using. The other challenge... so, for the first

item, the first challenge, the recommendation is to go after what we call the correct tool for a specific language. For one specific ecosystem, there could be tooling that focuses mostly on that language and provides a better-quality SBOM compared to more general tooling. For example, from our experience cyclonedx-gomod provides better-quality Golang dependencies. Eventually, where the industry is going to converge is that each of these language-specific tools would be the base, and all the other tooling would run a command for that specific language to generate the SBOM. So I believe the industry is eventually going to converge to the point that

we don't see these discrepancies anymore. The other challenge is unsupported build systems; for example, we don't have any tooling for generating an SBOM for Bazel-built projects. The monorepo is one of the common challenges that the industry has, because everybody follows Google, and Google has this monorepo system, so many companies have monorepos. But what happens in monorepos is that there is no specified boundary between the projects from the perspective of the tools that generate SBOMs. You end up running an SBOM tool in the root project, which is a collection of projects, and you get one big SBOM, and those tools don't understand that a specific folder is a separate

server project, so they cannot provide a quality result. Of course you can annotate your tooling and have higher-level tooling that helps with that, but at the moment it's not embedded in SBOM tooling. The other challenge is limited support for build-time SBOMs, and I guess we have to wait and see how the industry goes in this direction to provide better support for build-time SBOMs. The last one, which is not actually a challenge, is that most people are concerned about the format, and I would say we shouldn't worry about the format because formats can be easily converted using existing tools.
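The tooling discrepancy described above (two generators reporting different dependency sets for the same repository) can be checked with a small comparison. This is a sketch under the assumption that both tools emit CycloneDX-style JSON with a `purl` per component; the two sample outputs are invented and do not reproduce the comparison from the talk.

```python
def purls(bom):
    """Collect the package URLs of all components that carry one."""
    return {c["purl"] for c in bom.get("components", []) if "purl" in c}


def compare(bom_a, bom_b):
    """Split dependencies into those both tools agree on and those only one reports."""
    a, b = purls(bom_a), purls(bom_b)
    return {
        "common": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


# Hypothetical outputs of two generators run on the same repository.
tool_a = {"components": [
    {"purl": "pkg:golang/example.com/lib-a@2.3.1"},
    {"purl": "pkg:golang/example.com/lib-b@0.9.0"},
    {"purl": "pkg:golang/example.com/test-helper@1.1.0"},  # e.g. a test-only dependency
]}
tool_b = {"components": [
    {"purl": "pkg:golang/example.com/lib-a@2.3.1"},
    {"purl": "pkg:golang/example.com/lib-b@0.9.0"},
]}

report = compare(tool_a, tool_b)
print(len(report["common"]), "shared;", report["only_a"], "reported only by tool A")
```

Entries that appear in only one output are the ones worth inspecting first: they are often test dependencies or noise, but they can also be genuine misses by the other tool.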

The other challenge in the domain of SBOM is understanding: we have generated all these SBOMs, but how and why should we use them? When we generate lots of data, it adds to the confusion unless we know how we should use it. One of the directions the industry is going in is to utilize SBOMs for vulnerability management, for example, and that means including more context in an SBOM. For example, you may be familiar with the concept of Vulnerability Exploitability eXchange (VEX), which is an extra piece of information that leads to the list of

vulnerabilities that your software is actually impacted by. If you couple that with SBOM information, or use SBOM information to generate the VEX, it's going to be very powerful tooling. As Trupti mentioned, we have to think about how we want to use SBOM information in the context of open source software security, and one aspect is automated software upgrades. When we generate SBOM information automatically and identify which specific vulnerabilities we are impacted by using the VEX information, we will be able to use tooling to automatically upgrade the affected dependency, for example, if there is a new version with the fix. So

what I'm trying to say here is that an SBOM by itself is a collection, a database of lists of dependencies, but we have to put it in context and couple it with other useful information, including VEX, to be able to take proper action. On the slide where I mentioned the shortcomings of SBOMs, one was the accuracy of the information that we get from an SBOM; for example, there were dependencies from the source that weren't accurate enough. One approach to overcoming the noisy information is to put together SBOM information from different stages of the software. For example, if we

have a collection of SBOMs generated from source code, from build time, and from runtime, it's going to help us reduce some of the noise. For example, from source we might see 10 dependencies, and at runtime we see, say, 3. At the least, this means that the 7 extra dependencies that we see from source shouldn't be the focus or the highest priority, if they are not outright false positives. Combining these two pieces of information shows us what the focus and the priority should be. Also, some of the information that we get from the build-time dependencies is something that we don't get from

source and runtime, because from those we don't know how this piece of software was built or what was included in the build system, which we need in order to find some of the vulnerabilities that the system has. Putting these pieces of information together helps us know which vulnerabilities are important: when we see them at runtime, they're the most important. It also helps us know how to fix them. For example, if you only have the runtime dependencies, when you want to fix them you don't know where they come from in the source code; with all the stages you can trace them back and see what happened during the SDLC that we ended up in this situation. So the conclusion on this slide is

that we need all three pieces of information together to have a usable SBOM that is less noisy and can serve us in the use cases that we mentioned. As an illustration of the last point: as you know, Datadog has an agent. The agent is open source; because customers have to install this piece of software on their infrastructure, we have to provide that. When you look at the SBOM from the source code, you see a bunch of dependencies, such as dd-trace-go and protobuf, with this version and this package URL. But what you don't see from the

source is the build-time SBOM, which has things like the version of Python that we are using and Pylint. Of course there is a link in the build-time SBOM that refers to this SBOM, but there is a ton of other information that is not included in the source code. When you put these two pieces together, you're going to see the 2D picture of the SBOM, and if you add runtime you're going to see the 3D picture, which helps you go through the use cases that I mentioned. With that, I'm going to hand it over to Trupti.

Thank you, Hossein. Just to put things

into perspective, the Datadog agent has more than 1600 dependencies, and for the sake of simplicity we only showed a snapshot of the major differences between SBOMs generated from source code and from build time. Now let's talk about some of the other strategic initiatives. Let's say that, using SBOMs generated from source, build, and runtime, we are getting accurate information. Is that sufficient to drive remediation at scale? Probably not. So for our SBOMs to be really effective, what other initiatives can we take to help with the situation of open source software security? The number one suggestion is: start training your developers. If you have programs like security

champions or a security ambassador program, teach your developers about the secure SDLC and how to design features and architecture from a security point of view. Put special emphasis on secure coding guidelines and secure CI/CD. Teach them how to use basic security tools like static code analyzers. When we publish articles, don't we all use spell check? Why can't we do the same with source code? Let's make sure our developers use high-quality static code analysis tools just like a spell check. If, for whatever reason, you don't have your own security champions program or developer education program, you can rely on some of the open source foundations that I have listed here. FIRST.org has an excellent two-hour training on how to prioritize technical

vulnerabilities and severity. OpenSSF has all these programs freely available for open source software developers so that they can learn about these vulnerabilities and avoid introducing these security flaws in our code. Last, I would say: create a culture of security so that developers feel empowered to own security end to end. When it comes to usability, scalability, and performance, developers do care about these things, right? Security is not only the security team's responsibility; developers should own it end to end, and we should act like trusted security advisors. That was about education. Now let's focus on what other things we can do. Using SBOMs, accurate detection is

awesome: it reduces noise and gives us the exact list of software upgrades that are necessary. But that's a very reactive approach. When things like Log4j or Heartbleed come along, we want to use that reactive approach, but as such it's not an approach that scales. We need to create an engineering culture where we teach our engineers to get into the habit of doing automated software upgrades even if there are no vulnerabilities; we shouldn't wait for vulnerabilities to start doing upgrades. As I mentioned earlier, almost 40 to 50 percent of organizations do not have a policy that would get engineering into that habit.

So we literally need to sit down with each and every engineering team and understand their SBOM and software composition analysis: which components are high impact and can cause regressions, versus which components are low impact and can be upgraded on the fly a few times a week or a few times a month. Get an agreement with them on the frequency of upgrades and so on. Something that is high impact for one team may not be high impact for another, depending on the design and architecture, and we can find all this information only after having an honest talk with them. So, besides generating SBOMs or rolling out fancy tools, definitely sit down with your engineering leadership and establish an automated

software upgrade program, and get that policy in place. Now let's talk about how we can go about building such a policy. Start with the licensing file or the SBOM output of different products. I understand this may not be a perfect list, but it's a good starting point. As I mentioned earlier, for each team, product, or major component, create a list of high-impact components that we cannot afford to upgrade every now and then. Maybe Postgres is a high-impact component that we can afford to upgrade only twice a year, whereas other libraries, such as a Ruby gem, we can upgrade now and then, and it's going to be different for each team.
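One way to write down such an agreement is as data that a scheduled job can check. The sketch below is only an illustration of the idea: the rule structure, the cadences (182 days for Postgres, 30 days for a gem), the gem name, and the dates are all assumptions, not a policy described in the talk.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class UpgradeRule:
    component: str
    impact: str        # "high" or "low", as agreed with the owning team
    max_age_days: int  # longest acceptable gap between upgrades


def overdue(rules, last_upgraded, today):
    """Return components whose last upgrade is older than the agreed cadence."""
    return [
        rule.component
        for rule in rules
        if (today - last_upgraded[rule.component]).days > rule.max_age_days
    ]


# Hypothetical policy for one team: Postgres about twice a year, a gem monthly.
rules = [
    UpgradeRule("postgres", "high", 182),
    UpgradeRule("example-gem", "low", 30),
]
last_upgraded = {
    "postgres": date(2022, 3, 1),
    "example-gem": date(2022, 6, 1),
}

late = overdue(rules, last_upgraded, today=date(2022, 8, 9))
print(late)
```

Because the cadence lives next to the component name, each team can carry different values for the same component, which matches the point that impact differs by team.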

So get that agreement, establish an SLA, and communicate that schedule as part of their sprint planning and quarterly planning so that they are well aware that this is coming and it's not a surprise. You know the way a severe vulnerability randomizes all of us; we all hate that, right? Developing the habit of regular upgrades will keep that randomization to a minimum and eventually create a culture of staying on top of your upgrades, including end of life. End-of-life components introduce operational risk: when components are not supported, we cannot rely on that code, and anything can be the weakest link. With that, let me discuss some of the

key takeaways. Open source software supply chain security issues are unavoidable; they are not going anywhere. But there is a ray of hope: SBOMs generated from source code plus build plus runtime, as Hossein explained, can definitely help reduce the noise and improve overall accuracy. Depending on the use case of the SBOM, we can add context; for example, in the case of vulnerabilities, we can add vulnerability exploitability information to drive prioritization. Other strategic initiatives, such as building an open source software security policy and a developer security education program, are priceless. Now, the call to action. If you don't remember anything else from this talk, I want you to remember these three things. First, do not rely on SBOMs generated just

from source code; find tools and mechanisms that can help you combine the results of SBOMs generated from build, source, and runtime. Second, always think about context, such as Vulnerability Exploitability eXchange (VEX) information; when context is added to an SBOM, it creates really valuable information, and with the help of an open source software policy you can ease the prioritization problem that most organizations face with software upgrades. And the last call to action: please, please consider rolling out developer security education to elevate the overall state of open source software security. The more our developers are empowered, the fewer security flaws they are going to introduce in code. With that, thank you so much, BSides, for having us here, and we are open for

questions. [Applause]
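The first call to action, combining SBOMs from source, build, and runtime, can be sketched as a simple merge that records in which stages each dependency was observed. The component names are placeholders, and treating "seen at runtime" as the top priority follows the reasoning in the talk rather than any particular tool.

```python
def stage_view(source, build, runtime):
    """Map each dependency to the set of stages in which it was observed."""
    view = {}
    for stage, names in (("source", source), ("build", build), ("runtime", runtime)):
        for name in names:
            view.setdefault(name, set()).add(stage)
    return view


def triage(view):
    """Dependencies seen at runtime come first; the rest are lower priority."""
    first = sorted(name for name, stages in view.items() if "runtime" in stages)
    later = sorted(name for name, stages in view.items() if "runtime" not in stages)
    return first, later


# Hypothetical observations for one service.
source_deps = {"lib-a", "lib-b", "lib-c", "test-helper"}
build_deps = {"python", "pylint", "lib-a"}
runtime_deps = {"lib-a", "lib-b"}

view = stage_view(source_deps, build_deps, runtime_deps)
first, later = triage(view)
print("look at first:", first)
print("lower priority:", later)
```

Build-only entries such as the interpreter or linter stay visible in the merged view even though they never appear in the source or runtime lists, which is the extra dimension the build-time SBOM contributes.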

So I just wonder if you have any examples of people who are using SBOMs, consuming them and doing stuff with them. I know I've been asked for them, and when I ask what format they want, I get answers like Excel. They've been told they need an SBOM, but they have no capability of actually using or understanding an SBOM. We have good tools to generate them, which may not be accurate yet, but what are the tools to consume them and generate actionable intelligence and actions off of them?

I can't really think of tooling that provides this service, but I guess there are platforms. Eventually, when you have this

SBOM information, you can put it in the form of a database yourself and make it queryable. That would be the closest suggestion that I would make. But there are already commercial tools that try to make SBOMs searchable and linkable. Also, when you have that in your database, if you have extra contextual information, you can link and join it to make the SBOMs usable. So I would say the first step would be to put them in the form of a searchable database.
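The suggestion to load SBOM data into a queryable database can be tried with Python's built-in `sqlite3` module. The table layout, product names, and rows below are assumptions made for illustration; a real import would read the component lists from generated SBOM files.

```python
import sqlite3

# In-memory database; a file path would be used to persist the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE components (product TEXT, name TEXT, version TEXT, stage TEXT)"
)

# Hypothetical rows flattened from the SBOMs of three products.
rows = [
    ("billing", "log4j-core", "2.14.1", "runtime"),
    ("billing", "jackson-databind", "2.13.0", "source"),
    ("search", "log4j-core", "2.14.1", "source"),
    ("reports", "log4j-core", "2.17.1", "runtime"),
]
conn.executemany("INSERT INTO components VALUES (?, ?, ?, ?)", rows)


def products_using(conn, name, version):
    """List products whose SBOM contains the given component version."""
    cursor = conn.execute(
        "SELECT DISTINCT product FROM components "
        "WHERE name = ? AND version = ? ORDER BY product",
        (name, version),
    )
    return [row[0] for row in cursor]


affected = products_using(conn, "log4j-core", "2.14.1")
print(affected)
```

Once the components are in a table, other context such as VEX statements or ownership data can be added as further tables and joined on component name and version.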

Any other questions? Please keep your mask down when you speak, otherwise it becomes difficult to hear.

Sure. Let's say, for example, that we solve the problem of having everyone generate the SBOM in a specific format. Now let's add that we work in an organization not with one product but with millions of products. That will generate alert fatigue and fatigue in the amount of risk to prioritize. Do you have any learnings on how to manage that increasing risk, taking into account that in application security we're always going to be outnumbered by the amount of things to do?

That's again a great question. When we have this collection of

SBOMs from the different software that our system is using, how do we prioritize the vulnerabilities that we see in the list? I would say the same approach applies; this is a great question for thinking about these dependencies from a priority perspective. I would say start from the runtime: the things that you see at runtime are dependencies that you are actually using. The list that you get initially is mostly from the source as well, right? That's usually what happens. So if you have some tooling, like application performance monitoring tools, that has visibility into the actual

dependencies that you are using in this software, that is a great starting point. The source code announces that there are, say, 2000 dependencies, and then at runtime you only see 10 of them, so those 10 dependencies are the highest priority; then you have enough time to go over the list of vulnerabilities from the source code base. That's the lowest-hanging fruit. But you can define different measures. For example, recently Google and some other developers have provided rankings and ratings of open source projects with respect to software supply chain security, so you can include those scores in the risk of each of these projects

and factor that into which one you want to address first. But that's not an easy problem to solve overall.

I would also like to add to what Hossein mentioned. Personally, what has helped me is Vulnerability Exploitability eXchange information: over a period, building a database of VEX and only fixing those components that are severely impacted. That's another good strategy to reduce the prioritization problem. The third thing I have done: we have to start somewhere, right? So if we can find the top 20 open source components or libraries used across all the products, and then provide guidance on how to securely consume those

libraries, that's another great approach. For example, the OpenSSL library is used in pretty much every security product, or for every security feature such as encryption of data at rest or encryption in transit (TLS). Making sure our developers know how to consume OpenSSL securely, and teaching them to make sure they don't use an OpenSSL version that is already susceptible to known CVEs, is a huge win. So those are some of the techniques I would use to drive prioritization when you have millions of products.
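The two prioritization ideas from this answer, starting from runtime and using VEX to drop findings that are not exploitable, can be combined in a short sketch. The status values mirror the commonly used VEX statuses (`affected`, `not_affected`, `fixed`, `under_investigation`); the findings, CVE identifiers, and component names are placeholders, and the ordering rule is an assumption rather than the speakers' tooling.

```python
ACTIONABLE = {"affected", "under_investigation"}


def prioritize(findings, vex, runtime_components):
    """Order findings: runtime-visible first, then confirmed 'affected' first.

    findings: list of (component, cve) pairs from SBOM-based scanning.
    vex: dict mapping (component, cve) to a VEX status string.
    """
    queue = []
    for comp, cve in findings:
        # No statement yet is treated as still under investigation.
        status = vex.get((comp, cve), "under_investigation")
        if status not in ACTIONABLE:
            continue  # not_affected or fixed: nothing to do
        runtime_rank = 0 if comp in runtime_components else 1
        status_rank = 0 if status == "affected" else 1
        queue.append((runtime_rank, status_rank, comp, cve))
    return [(comp, cve) for _, _, comp, cve in sorted(queue)]


# Placeholder data; the CVE identifiers are not real.
findings = [
    ("lib-a", "CVE-0000-0001"),
    ("lib-b", "CVE-0000-0002"),
    ("lib-c", "CVE-0000-0003"),
    ("openssl", "CVE-0000-0004"),
]
vex = {
    ("lib-a", "CVE-0000-0001"): "not_affected",
    ("lib-b", "CVE-0000-0002"): "affected",
    ("openssl", "CVE-0000-0004"): "affected",
}
runtime_components = {"openssl", "lib-c"}

work_queue = prioritize(findings, vex, runtime_components)
print(work_queue)
```

With many products, the same function can be run per product and the resulting queues merged, so that runtime-visible, confirmed findings surface first across the whole portfolio.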

All right, any more questions, comments, insults? Going once, twice, thrice. All right, thank you everybody. This means a lot to us. Thank you, BSides Las Vegas.
