Building A Security Program For SaaS Product Development

Name: Building A Security Program For SaaS Product Development
Uploaded: 2022-05-19
Duration: 30 min 43 s
Description: Imagine the following fictitious scenario: you are starting a new job as the first security engineer of a startup with a software-as-a-service (or platform-as-a-service) offering built on top of well known public cloud platforms with cloud-native technology. Being the first person to tackle security

BSides Munich · 2022 · 30:43 · 228 views · Published 2022-05

Speakers

Christian Bauer

Tags

Category: Technical

Style: Talk

Mentioned in this talk

Tools used

Argo Workflows, Cloud Custodian

Platforms

Google BigQuery, Kubernetes

Service

Amazon S3

About this talk

Imagine the following fictitious scenario: you are starting a new job as the first security engineer of a startup with a software-as-a-service (or platform-as-a-service) offering built on top of well known public cloud platforms with cloud-native technology. Being the first person to tackle security as a full time job, this might seem like a monumental task. How to quickly get a first overview of the current security posture? Where to start with security improvements? How to prioritize? How to define a security roadmap? This talk will provide an overview on how to introduce security into a typical cloud based product from the ground up. Short-, medium- and long-term security activities will be discussed, with specific proposals for which high impact topics should be addressed in the beginning. We will cover a broad range, from technical topics, such as tooling for security automation, all the way to non-technical topics such as compliance.

Speaker: Christian Bauer, security engineer with a special focus on cloud security.

Transcript [en]

well thank you for the introduction so as the name implies here this is going to be about uh how to secure cloud-based product developments but before we start maybe a few words about myself so i've been working in the software industry for a few years now i started out as a software engineer originally but then transitioned over into a security role and i'm currently working as a security engineer securing an as-a-service platform which is currently running on cloud-based platforms and this talk is kind of a summary of the things i've learned over the years both in my current job but also in my previous job where i've seen quite a lot of different cloud-based products

okay so imagine you're a security engineer and you're starting a new job right today you're working for the startup called acme and you've been hired specifically to build up their cloud security program that means you're the first person who really thinks about security from a holistic perspective and hopefully who knows what you're supposed or what they are supposed to be doing and when we are talking about product well their product is what you see very often these days on the one hand it's an as-a-service offering can be platform as a service or software as a service it's running on those usually well-known cloud provider platforms and their technology stack is also something you see very often these days

so the applications are packaged into containers the containers are being deployed and executed by kubernetes and you have some additional cloud services around it that you need for compute instances for storage services and so on so this is nothing too let's say unusual that you can see in cloud space these days so the question is where do you start right security is a huge topic and you can't do everything on day one you have to somehow prioritize and think about what can you do in the beginning and what can you do later that means in the beginning we want to go for some easy wins right things which are so called low hanging fruits but at

the same time have a high security impact in terms of improving the security posture of the product so how can we do that and that's basically what this talk is about how can we define a security roadmap that somehow starts in a relatively easy way and that scales up over time so the security activities that i'm going to present on the next slide are grouped into different phases the first one is like the absolute basics that really everyone should start with and then starting from phase two we are going to have more topics that might only pay off in the medium term whether something should be in phase two or three or four is a little bit let's

say it's always biased it depends what's your background right people here who have a web application security background might complain hey there should be more web application security in there i've done a lot of work in the infrastructure space so obviously it's maybe a little bit more biased towards infrastructure it also depends what's the current security posture of a company and so on so long story short phase one should really apply to everyone afterwards it depends what's the best approach for you but it's more like to give you some inspiration all right so what are the security phases i have defined four phases and if you do all of those

you already have achieved a quite good security uh posture doesn't mean you're completed but you have i think some good achievements phase one at the very top let's say the basics i'm going to talk about those in detail phase two these are already some additional topics i will briefly cover them and unfortunately regarding phase three and four i don't have enough time today to talk about that but i have backup slides if anyone should be interested the idea is phase one as i said the basics phase two is some topics that are again easy wins but at the same time should pay off on the medium term starting from phase three we have some topics that

really either need a lot of effort or are only going to pay off on the long term especially for phase four so as i said i don't have much time so let's get started phase one and this is actually the very first task that you have to do at the beginning so remember you are coming in it's your first day and you have no idea what you're dealing with you don't know how to secure something if you don't know what you have so the first idea is you have to get an overview of the cloud environment which cloud providers are you using how many accounts do you have do you have a segregation between production and non-production environments

and most importantly you need access to those accounts and start looking into them what kind of infrastructure is deployed in those accounts what kind of services are being used and most importantly how are those services configured from a security perspective we'll talk about that a little bit later if you're only dealing with a handful of accounts well on aws it's called accounts on google cloud it's called projects but it's basically always a cloud environment you will easily run into a scalability problem if you have to deal with dozens or maybe even hundreds of accounts you can go over them manually right you can log into every individual account start clicking around on the dashboard

that's not going to scale so you need some form of automation and there's a lot of tools out there that can really help you with that and these are all open source tools that i've listed here and that list is not even complete which tool is the best one for you depends on what you prefer working with and also how many cloud platforms you have to cover some are multi-cloud some are only supporting one individual cloud platform in particular aws because it's the most widely used one it also depends how you want to handle the tool and what kind of output are you fine with reading json output or do you need some table output do you want to

process this further either way i personally have used steampipe a lot because it's super flexible and supports a lot of different platforms including software as a service platforms like cloudflare or github but the idea is you can use that to automatically obtain data from your environments and specifically kind of select what kind of information you are looking for what kind of misconfigurations maybe that gives you a basic idea on how serious security was taken in the beginning right if you already see some bad things there you know oh this is not a good start otherwise if it's looking roughly good you know okay i don't have to start from the beginning either way so i've used this tool to

collect a lot of information and then have to start looking through this information and the very first thing that i usually want to do there that's step two now perimeter protection and in the cloud we have two perimeters the first perimeter is identity and access management and i think this is a big differentiator of cloud compared to on-premise systems because the cloud provider api is the thing that's used to access your cloud environments that's reachable over the internet to everyone the only thing that keeps you from getting breached on the cloud provider level is to keep your credentials secure that's why it's so important to really look at what kind of credentials are currently being used

so we have to differentiate here between human users and service accounts human users as we know you have a login form you log in with a password maybe then you get in that's what at least aws supports actually you have console login with a password or if you need api programmatic access you have an access key both are long-term credentials that do not expire and you don't have to rotate them which means that's a problem actually if you look into the publicly known security incidents of aws customers very often the initial access vector for the attacker was a leaked access key like somebody committing it to a public github repo
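The leaked-key scenario described above is what credential scanners look for. The following is a minimal sketch, not the speaker's tooling: it assumes that long-term AWS access key IDs are 20 characters starting with `AKIA` and that temporary ones start with `ASIA`, and the function name `find_access_key_ids` is made up for illustration. It only spots key IDs in text; it cannot tell whether a key is still active.

```python
import re

# Assumption: access key IDs are 20 characters long; long-term keys start
# with "AKIA" and temporary (STS) keys start with "ASIA".
ACCESS_KEY_ID = re.compile(r"\b(AKIA|ASIA)[0-9A-Z]{16}\b")


def find_access_key_ids(text):
    """Return (line_number, key_id, kind) for every key-ID-like string in text."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            kind = "long-term" if match.group(1) == "AKIA" else "temporary"
            findings.append((line_number, match.group(0), kind))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample = 'region = "eu-central-1"\naws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'
findings = find_access_key_ids(sample)
for line_number, key_id, kind in findings:
    print(f"line {line_number}: {kind} access key id {key_id}")
```

A check like this could run over a repository before every commit or on a schedule; dedicated secret scanners cover far more credential formats than this single pattern.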

so you have to somehow get rid of those access keys or more generally long-term secrets so for humans there's no need to use them anymore because you can set up single sign-on the idea is you use single sign-on and then your human user can obtain a short-lived token to access the cloud provider interfaces including the api and as we're talking about single sign-on we should obviously be using multi-factor authentication as well these days and particularly for the production environment i think it's worth investing in physical authentication tokens as well like fido2 tokens which i think are very cost efficient now single sign-on is not going to work with service accounts right we need

something else there i see quite often that you have an application that's running on for example google cloud and needs access to the google cloud api that means your application is going to need a credential and people sometimes use those long-term secrets just for that but you don't have to do that anymore because the better way is to obtain a short-lived token from the cloud provider metadata service so in the case of for example aws if you're running applications inside kubernetes you have a feature called iam roles for service accounts which means you assign an iam role to your application that means the application can get a short-lived token from the cloud provider metadata service there is

no long-term secret involved anymore those tokens usually only have a lifetime in the order of magnitude of hours it's the same on google cloud the feature there if you're using their managed kubernetes service is called workload identity which works basically the same you have an application running inside google's managed kubernetes you assign a role to it and then it can obtain a short-lived token from a metadata service so that works very well if your application needs access to the cloud provider api of the same platform things are a little bit more complicated for cross-cloud provider access like an application running on google cloud needs access to the aws api or the other way around until maybe two years ago the only solution was you generate an access key

in aws you export it and basically import it into your application on google cloud which means you have a long-term secret that you can leak again luckily you don't have to use that anymore because you can now use identity federation for service accounts so if i continue this example my application running on google cloud obtains the token from google cloud as described here then this application starts calling over to aws and authenticates with the google cloud token which is accepted by aws and they will issue an amazon token for that now the application can use that amazon token to access the api and all of those tokens are again short-lived and this works because on the aws side i

will set this up saying hey if somebody shows up with a google token that was issued by google and is assigned a particular identity at google then this person can now get an aws token meaning that i've completely eliminated the need for long-term secrets on both aws and gcp and the same works with other cloud platforms as well which means i have reduced the risk of leaking long-term keys which is often a problem for many people or companies so that was about access keys or credentials another issue i see very often in the beginning i mean you have to assign iam privileges right you have a token that's one thing but you also need to define what can

actually be accessed with that token and people often then assign a lot of privileges right because it makes life easier for them so it makes a lot of sense to look at all those iam policies and basically see if you find these kind of situations where people like to use wildcards that means they can do anything and to cut them down so this is basically the least privilege principle that you have to review that was the iam perimeter the second perimeter that's basically the same thing we have in classical non-cloud on-premise networks which is the network perimeter so what i have to look for there is do people actually have a network architecture meaning a public dmz and

private zones where most stuff should be deployed in the private zones and only the bare minimum should be in the public zone which is publicly reachable and the second thing is firewall reviews that happens really often people start opening up ports to the internet and internet means again to anyone and you have to look in particular for things like open ssh ports usually when you open up an ssh port to the internet it takes only a few hours and people will start brute forcing you then if you have a password like centos centos then you will be breached now what i've said is we're going to basically look at the output from the tool and go over it manually right so today

everything might look fine maybe next week it's not fine anymore because somebody has already started opening up firewall rules or generated access keys so we cannot manually go over that every single day or even once a week because that's not going to scale again so we need some way of automatically monitoring our entire environment for this kind of security violations and the solution to that is called cloud security posture management which is basically just a piece of software that's going to look into your cloud environments by calling the apis and checking if anything violates your predefined security policies like what i've said before right you have access keys there somebody has opened up the firewall rules somebody has a storage

system that is publicly reachable to anyone and there's different solutions for that how you can implement it option one is a cloud provider service every provider basically has a solution for that there are also third-party offerings for that for example software as a service platforms basically you grant them read access to your cloud environment and they can monitor everything and then tell you or raise alerts if something goes wrong some also offer the tool itself so you just get a software package and you deploy and operate it inside your own environment this has the advantage that you still have full control and don't have to grant a third party access to your environment or third option open source tools that

you can use to build basically your own cloud security posture management cloud custodian has been around for many years i think it was one of the first tools and it's very popular i think with some people steampipe that i mentioned in the beginning for inventory you can also use it for cloud security posture management although at some point depending on how large your environment is you might run into scalability problems but the idea is this allows us to monitor everything automatically once a violation is detected you get an alert and you can run this maybe depending on the solution in almost real time or like once a day or once a week so next part sooner or later you're

going to have a security incident whether it's minor or major is a different story but when you have that happen you have to start investigating and investigating means you have to go over your log files right because without that you can't really do a proper investigation what i found is the easiest way to do that is to set up a central log data sink and to just dump all your log data in there just for storage at the beginning and the most cost efficient solution for that is to use one of the storage services like amazon s3 and so on because it's easy to set them up you can drop as much data in there as you want because there is kind of no limit and

they are very cost efficient as well you can use a siem but that can become very expensive depending on what kind of solution you are using because of the huge amount of data that you're dropping in there and the first kind of log data that we feed in there is going to be the cloud provider logs if you have a breach on the cloud provider level that means the attacker is going to call the cloud provider apis and those calls are going to show up in the management api logs that's why it's important that you activate forwarding these api logs into your central log data sink and you make sure nobody can mess around with that so you

need access control on that now we are just storing data right at some point when you have a security incident you might have to query this data and depending on how large your environment is you are talking here about querying gigabytes of data 10 gigabytes maybe even 100 gigabytes so you're going to need some infrastructure for that but as i said in the beginning our scenario is you're the first and only security engineer you don't have the time to maintain and operate infrastructure for that so my recommendation here is you go serverless you have these uh services data lake querying services like athena or bigquery the same should exist for azure that allow you that they're very easy to

set up they don't need any maintenance at all and you can run queries in a relatively cost efficient way over your entire data basically the more data you query the more you have to pay but if you don't query that often it makes sense to do that it's not a lot of effort to set up right then incident response as i said when the incident happens you need to be prepared and you have to do that in advance right how does your basic process look like you are going to need to store evidence somewhere so you need a dedicated location for that it makes sense to track your incidents in the issue tracking system and you need

to define in advance who is your contact person for what topic right operations you have to talk to them if customer data is affected you have to reach out to your customers but you're not going to do it yourself you need someone from customer support for that so that's important to define who's the contact for what topic in the beginning before something happens so that by the time you have the incident you're ready to go and finally vulnerability management i think that's a very old topic it's not too different in cloud the only difference is we have vulnerability management on two levels you have to scan for vulnerabilities first on the container level which also includes our application

dependencies and on the host level meaning the host machines where our containers are being executed there are open source solutions available for that anchore and clair are working very well primarily for containers host machines is a somewhat more complicated story i don't know any reliable solution for that in the open source space for vulnerability scanning on the host machines and one more thing ci/cd-based scanning makes a lot of sense yes but it's not sufficient right if i build a container image today scan it today and it has no vulnerabilities this assessment might no longer be true next week so you need continuous vulnerability scanning that's something some people forget about and identifying vulnerabilities is one thing

fixing them is another story we have to agree on slas for that with engineering and agree on something like one week to get a critical or high vulnerability fixed or something like that well and that's already phase one this should give you a basic standing that will take you at least weeks maybe a few months depending on how much in-company support you will get for that after that we'll continue with phase two and there the first thing is defining the security baseline as a security engineer you usually know what you should do right but that's usually not clear to engineering so it's very important to write it down define what are your expectations towards engineering for that and you

don't have to come up with this yourself there's a lot of good stuff out there in particular the cis benchmarks they have basically security best practices not only for cloud providers but also for kubernetes for docker images and so on it makes sense to just adopt those standards once i've written down my baseline i need to set up the monitoring for that right to see if people are actually following my security baseline and this is where i can use the cloud security posture management because i have to implement my baseline basically in there to look for violations of my baseline then another easy win kubernetes or container deployments a lot of people when they run the

application inside containers those applications are running as root i've seen this so often i don't know what's going on there but nobody's thinking about that there's a very easy way to override that in kubernetes you have a so-called security context object and there you have a parameter called runAsUser that allows you to override that you just specify in your deployments no run this as an unprivileged user and then it's no longer running as root it's super easy to implement and there's a lot more security features in kubernetes that you can use finally something that people sometimes miss if you have two applications running in the same kubernetes cluster there is no network isolation between

those applications application a can always reach application b and vice versa that's just the way vanilla kubernetes works the way to fix that is to use so-called kubernetes network policies which are usually never enabled by default you have to explicitly enable them and start deploying network policies which basically is a firewall rule that applies inside the kubernetes cluster then again as i said about using this runAsUser option how can you be sure that people are actually using that you have to again set up the monitoring for that that means you have to extend your cloud security posture management to also monitor your kubernetes clusters to see what people are deploying there actually and

now a very different topic so security in cloud usually means you have a scalability problem that means you need to automate as much as possible for that you are usually going to need some tool right for example i want to run a dynamic application security testing once a week i want to run some port scans maybe every day same with credential scanning maybe so i need some platform that allows me to define what tool am i using how i'm going to call this tool how i'm going to process the output and where do i push these outputs then i need a time scheduler for that this is what workflow orchestration engines are very useful for again the

cloud providers have serverless services that allow you to do that there's also two kubernetes frameworks for that which are argo and tekton i personally used a lot of argo workflows but functionally it's almost the same as tekton so it doesn't really make a difference or if you don't have that much security automation of course you can just spin up a virtual machine with a cron job on it that schedules everything a lot of this is reactive right so how do we get things into a more proactive mode this is not even cloud specific you have to hook into the software development lifecycle that means i need to be involved in the design phase otherwise i will always keep running

after people to get things done and yeah the benefit is kind of obvious i think right and that's phase two and then in theory i can continue with phase three and four depending on what i want to do and so on and how the security posture is now now before wrapping up some final thoughts so our startup acme is doing business with business customers and there are some customers that would say i really like your service offering but i will only do business with you if you have a security certification like iso 27001 or soc 2 or something like that now i'm not going to get into the topic of whether those certifications are going to improve your

security posture but i think that's the wrong way of looking at it you can use those programs to actually push for your security agenda to get things done because if somebody asks you hey why do we have to do this you say well we need this for soc 2 and if you don't have soc 2 we will not win that customer so that will help you a lot because nobody is going to argue about bringing in more revenue right so you've changed security into being a business enabler which is making discussions a little bit easier in my experience yeah and finally that's more of a culture question how are you approaching security right security is a very cross-functional role we have

to deal with a lot of people right you talk to customers about incidents and as we're talking about product security we are dealing a lot with engineering you have to build a relationship with them and that means are you going to be the guy who's just going to make trouble for them and who is going to be blocking them all the time or are you more seen as a valuable support function because this can really influence a lot whether they are willing to work with you or not right and before i wrap up i'm not the first person to think about how to do this in a structured way there's a lot of very

interesting material out there in terms of talks and also documents i highly recommend going for this if you are interested in this topic well and that's it [Applause] thank you very much christian for the interesting talk i think the one key takeaway is that as a security person you should not try to be the blocker but more like a valuable resource that's a very good recommendation i think um are there any questions from the audience please step forward and um

hi thank you great talk um i was wondering uh what is a good way to get more support by the engineering team and everybody in the company for getting a better security system that's a tough question well i think there was one of the slides that i had to remove for time reasons i think you need management support if management doesn't care about security then things are going to be difficult and the other thing i think if you work with different teams there are also people who are for security and some people who don't care it's important to identify basically those people who are interested in security because those are going to be your allies and i think these are basically kind of

security champions that you can try to push thank you hey a question about the last thing you touched on on security culture um yeah so building relationships with other departments and not being a blocker sounds great until you hear that someone is trying to put an rdp password logon machine on a public ipv4 address so you have the one side where everybody here at least thinks they have to put the foot down but then you kind of go against that how do you keep balance yeah so there's an interesting talk from netflix and their security culture says trust but verify their idea is they have monitoring in place

if one team screws up the monitoring will show it immediately and then they go to those people hey people you have to fix this and you have dashboards that show how well teams are doing they're kind of playing the teams against each other saying oh this team is really doing well you're more like down there maybe you should really do something this is kind of an intrinsic motivation for other teams that works for netflix i don't know if that works in other places too all right thanks um so actually regarding working with other departments from my experience it helps and this also goes for others when you accept that they have problems or issues

when you take them seriously from my perspective they also take you seriously if they see that you're competent most of the time what uh developers and engineers think is okay someone comes around and he wants something but we have other problems so that's just a remark i have two questions um first you talked about setting up a perimeter there's nowadays this thing called zero trust model and there are a lot of guys saying oh everything has to be secure with zero trust what do you think about that one would be a first question yeah so i mean it's possible right zero trust i mean i'd say the thing is um with uh with the perimeter you kinda so

if you had said in this later phase you remove the perimeter and do everything for example in vpn or separate control and data plane then i would say okay for the first step perimeters are good but in my experience especially still in cloud people are saying oh i'm inside the trusted perimeter so that's no problem anymore right well i mean i'm not going to say zero trust will just remove the network perimeter you still have it right as i said in kubernetes there is this idea that you are inside a trusted island and then everything is good i think zero trust for me is the opposite it means micro segmentation

and micro segmentation for me includes a network perimeter maybe on a smaller level but still it's there and i think with iam it's actually very easy in cloud to do zero trust for human users you can set up iam policies to say you can log in but only monday to friday and only from a particular ip address range for example i think cloud is maybe the easiest place to implement zero trust yeah the second question i was a little bit surprised that you did not talk when talking about the developers about the supply chain stuff because i'm more of an infrastructure guy myself but the thing is uh with the packages

situation today in our days i would say okay we really have to look where the developers are getting their software from and their libraries so um in which phase would you say is that later on it depends okay so well i mean if you had problems in this direction you should pull this into phase three at least obviously right i mean vulnerability management is covering this a little bit but not completely obviously but i'd say i would probably put it into phase four okay cool thanks we are over time a bit already is it a quick one thanks thanks again for the presentation so i'm wondering so here you're starting with the assumption that you have a full

access into the network into oh to the system visibility how much would it be different if you have a partial or no view you know if some part is outsourced or you're a part of the team that actually doesn't have access to the entire system well if you have no view you have a problem right you know i mean you can still do the perimeter right scan that but that's it if you have a partial view it depends i think a lot on what do you have access to like do you have access to the cloud provider api and that's it or do you also have access into host machines and

things like that i think you should always go for access as a security team because actually most cloud providers will give you a dedicated security read-only role just for that purpose it's called security auditor role and you should always go for that i think if you don't have that then you can't do your work i mean it's impossible cool great stuff um thank you very much christian thanks to the audience for the good questions [Music]
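The monitoring idea that runs through the talk (read the environment with read-only access, then check it against predefined security policies) can be sketched in a few lines. This is an illustrative toy, not Cloud Custodian or any other tool named in the talk: the record layout, the field names such as `run_as_user`, and all function names are hypothetical, and a missing `run_as_user` is treated as root, which is a simplifying assumption.

```python
# Hypothetical, simplified inventory records; a real tool would read these
# from the cloud provider and Kubernetes APIs with a read-only role.

def check_firewall_rule(rule):
    """Flag rules that open SSH or RDP to the whole internet."""
    if rule["source"] in ("0.0.0.0/0", "::/0") and rule["port"] in (22, 3389):
        return f"firewall rule {rule['name']}: port {rule['port']} open to the internet"
    return None


def check_iam_statement(statement):
    """Flag allow statements whose actions contain a wildcard."""
    has_wildcard = any(a == "*" or a.endswith(":*") for a in statement["actions"])
    if statement["effect"] == "Allow" and has_wildcard:
        return f"iam policy {statement['name']}: wildcard action"
    return None


def check_container(container):
    """Flag containers without a non-root user (missing value counts as root)."""
    if container.get("run_as_user", 0) == 0:
        return f"container {container['name']}: no non-root user set"
    return None


def evaluate(inventory):
    """Run each check over its inventory section and collect the findings."""
    checks = {
        "firewall_rules": check_firewall_rule,
        "iam_statements": check_iam_statement,
        "containers": check_container,
    }
    findings = []
    for section, check in checks.items():
        for resource in inventory.get(section, []):
            finding = check(resource)
            if finding is not None:
                findings.append(finding)
    return findings


inventory = {
    "firewall_rules": [
        {"name": "allow-ssh-anywhere", "source": "0.0.0.0/0", "port": 22},
        {"name": "allow-https", "source": "0.0.0.0/0", "port": 443},
    ],
    "iam_statements": [
        {"name": "admin-everything", "effect": "Allow", "actions": ["*"]},
        {"name": "read-logs", "effect": "Allow", "actions": ["s3:GetObject"]},
    ],
    "containers": [
        {"name": "web", "run_as_user": 1000},
        {"name": "worker"},
    ],
}

for finding in evaluate(inventory):
    print(finding)
```

Run on a schedule, a loop like this turns the one-off manual review into the continuous monitoring the talk argues for; each written-down baseline requirement becomes one more check function.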
