
Thank you very much for introducing me. Hello everyone, my name is Alina. I am a software engineer at Yelp in London, where I do infrastructure security. For those of you who don't know about Yelp, it is a website and mobile app that connects people with great local businesses. Yelp is also in Spain, luckily. You can find anything from restaurants, bars and spas to dog groomers, mechanics and even dentists, so not only food-related places. If you're new to Yelp, or you haven't already done so, you might want to check out the app for some great reviews nearby, and Barcelona is a really great place for very good food.
Cool, this is the agenda for today. I'm going to talk about shifting security to the left and introducing it as early as possible into the development process. Then I will talk about what our CI/CD (continuous integration, continuous deployment) pipelines look like, what kind of security tests we have in place at this level, how we alert on failures, and how we track and fix them, of course. We should also have some time for questions at the end. I want to spend a little bit of time talking about this shift-to-the-left paradigm, which is one of the reasons we've put in
place all those tests that I'm going to go through very soon. Security should actually be everyone's responsibility, not only the security team's responsibility, from when the code is designed, developed and tested until it's deployed to production, and after that. So the aim is to have it as early as possible in the development life cycle, and at all stages, by having a more proactive approach rather than only reacting when something bad happens. Automation is basically key for all of this, because we have thousands of services and it will not scale otherwise. This should also increase the collaboration between teams, so we will not have silos between the development, operations and security teams
and the latter will be able to assist in fixing any security issues throughout the process, rather than being a blocker against going into production because it was involved too late in the process. This is what happens in the picture. It's from Twitter; I quoted the author, but I think it's not visible, it's in gray in the left corner, so it's not mine. And last but not least, it's a lot cheaper to find security problems during internal development rather than when the service or the code is already in production and the company's money and reputation could be affected in case of a breach.
We do have thousands of services at Yelp. They run in Docker, obviously; they run on our platform as a service that we built in house, called PaaSTA. It has been around for a while, and you can find it on our public GitHub. If you're curious about the technologies behind it, we use Docker for containers; Mesos and Marathon for scheduling work (this is the Mesosphere logo, I think); SmartStack from Airbnb, an open source project of theirs, for service registration and discovery; then Sensu for monitoring and alerting, which I will come back to at the end; and of course Jenkins, where the CI/CD
pipelines are run. This is what a simple pipeline looks like. From a configuration repository, Jenkins orchestrates the building and deployment of our services. In particular, we use pipelines of sequential steps that watch for pushes to Git repositories, then build the application artifacts and run different tests, including some performance tests, and roll the code through different development and stage environments before pushing the application to production, to be run on PaaSTA, which I've just mentioned. One step in this pipeline is the security check step, the one in the middle, which I'm going to discuss in detail next. It consists of a set of tests. They are written in Python.
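As a rough illustration of how a step made of several Python tests could be organized, here is a minimal sketch. It assumes each test is a plain function that takes a service name and returns a list of failure messages. The names `CheckResult`, `run_security_checks`, `format_report` and the two placeholder checks are hypothetical and are not taken from the talk or from Yelp's code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumed interface: a check takes the service name and returns a list of
# failure messages (an empty list means the check passed).
Check = Callable[[str], List[str]]


@dataclass
class CheckResult:
    name: str
    failures: List[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.failures


def run_security_checks(service: str, checks: Dict[str, Check]) -> List[CheckResult]:
    """Run every check in order and collect all results (no early exit)."""
    return [CheckResult(name, check(service)) for name, check in checks.items()]


def format_report(results: List[CheckResult]) -> str:
    """Render one PASS/FAIL line per check, with failures indented below it."""
    lines = []
    for result in results:
        status = "PASS" if result.passed else "FAIL"
        lines.append(f"[{status}] {result.name}")
        lines.extend(f"    {message}" for message in result.failures)
    return "\n".join(lines)


# Hypothetical placeholder checks, only here to exercise the runner.
def always_ok(service: str) -> List[str]:
    return []


def always_fails(service: str) -> List[str]:
    return [f"{service}: example failure"]
```

Collecting every result instead of stopping at the first failure matches the idea of showing engineers the full security status of a service on each build.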
At a very high level, they check the security status of the service, and it's quite handy to find out with every build how a service is doing security-wise. Whenever anything changes, or there's anything different, the owner team gets an alert and they can immediately have a look to fix any problems, so we find this quite helpful. Now, if you remember this diagram from a few slides ago, the PaaSTA security check is basically at the build level. OK, it should be big enough now. If we want to go even more to the left, we can do so at the design level by adding a security section to the design document template, much like
a privacy consideration section, if we already have that, to ask about the security aspects of the new code; or at the development phase with secure-by-default libraries and pre-commit hooks that do different kinds of static analysis, or of course in code reviews with our peers. And even after everything is deployed, with pen tests, we can run bug bounty programs, put in place different scanners, or do even more static analysis. I'm now going to walk through the six or seven security tests that are run every time a service is built. What you will see at the very bottom is the output from Jenkins, basically what engineers can see,
everyone who has access to Jenkins can see it. First of all, we check whether the latest Debian packages are installed, against upstream repositories. To do that, we pull the Docker image of the service, run the container, and inside the container we do an apt-get update followed by an apt-get dist-upgrade to see if any new packages are available. -qq is super quiet, as we don't want the output to be too verbose, and --just-print is a kind of dry-run mode, so no changes will actually be performed. We want to see the output and decide what to do, because dist-upgrade is quite aggressive: it not only updates
existing packages but could also try to remove packages which are not needed anymore, or install new ones to cope with dependencies, so we want to vet that. For every new version which is found, it will write to the console the current version and the version we should upgrade to. As you can see, package systemd needs to be updated from something.4 to something.5. Then we want to check Docker best practices. First of all, we want the container to not run as user root, because that's not following the principle of least privilege, and unfortunately this is
the default behavior: if you don't have a USER statement in the Dockerfile, the container will run as user root, which is obviously very bad. I remember a tweet from the AWS re:Invent conference, you know, that big Amazon conference in Las Vegas, from last year, which said that 86 percent of the public images on Docker Hub don't have a USER statement. That is a lot, for such an easy thing to do. Moreover, we want the service to use Docker images that we maintain, so we know what they contain, and of course the latest images; we don't want older versions. Related to the previous test, people
could pin Debian packages in the Dockerfile, a behavior that we want to discourage by failing the latest packages check and mentioning the pinning in the Jenkins log. The last one is very interesting, though it looks like a minor thing: .git should be added to .dockerignore. By doing so, we make sure that .git will not end up in the final Docker image of the service, which basically translates into: don't store information about your Git repository in production. Even if your web server doesn't have directory listing enabled, a diligent attacker will still be able to download the entire code and look for more severe issues. So we should pay attention to small
details like this. In terms of well-known vulnerabilities, one very specific one that we are looking for is Shellshock. It's a Bash vulnerability, and it's also super easy to check, in under 15 lines of code, so it's a really nice quick win. The vulnerability is about being able to unintentionally run arbitrary commands from environment variables, and if the exploit is successful, the attacker is able to take over the machine remotely and have full control, so not good at all. We hope to extend the list in the future and look for other common vulnerabilities similar to this one. Then we do a code dependency check for a few
code bases, but the idea is very similar. For example, for Node there is npmjs.com/advisories, which is a database of vulnerable packages, the vulnerability they suffer from, and the version we should use instead if it was patched, as not all of them are patched. What we do is parse the yarn.lock file, which contains all the packages with all the versions that the application is using, and for every package we check if it is in the database of vulnerabilities. For Python, in a very similar way, we parse the requirements.txt file, which again contains the packages and the versions used, and check against the database of vulnerabilities. There's also a project called Safety
which does everything for you, so you don't have to parse anything or check anything yourself. Its database is updated every month for the free version, and in real time, which is a lot more convenient of course, for the paid version.
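The Python dependency check described above can be sketched as follows. This is only an illustration under stated assumptions: the advisory data is a hypothetical in-memory dictionary mapping a package to its first fixed version (a real check would query a vulnerability database), only `==` pins are considered, and versions are assumed to be purely numeric and dotted. The function names and `examplepkg` are made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical advisory data: package name -> first fixed version.
ADVISORIES: Dict[str, str] = {
    "examplepkg": "2.1.0",
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '1.2.3' (a simplification)."""
    return tuple(int(part) for part in version.split("."))


def parse_requirements(text: str) -> Dict[str, str]:
    """Return {package: version} for lines pinned with '=='; other lines are skipped."""
    pinned: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pinned[name.strip().lower()] = version.strip()
    return pinned


def find_vulnerable(pinned: Dict[str, str], advisories: Dict[str, str]) -> List[str]:
    """Report every pinned package whose version is older than the fixed version."""
    findings = []
    for name, version in sorted(pinned.items()):
        fixed_in = advisories.get(name)
        if fixed_in and parse_version(version) < parse_version(fixed_in):
            findings.append(f"{name} {version} is vulnerable, upgrade to {fixed_in} or later")
    return findings
```

As mentioned in the talk, not every advisory has a patched version, so a real implementation would also need to handle entries with no fix available.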
Image vulnerability scanning, and the next one, we currently run standalone, and I will explain why in a second. To be very honest, they are not part of the security check run by Jenkins, but they are great candidates to be integrated. What we do is scan our base Docker images with Clair from CoreOS; it's open source as well, and it can provide a list of high, medium, low and negligible Common Vulnerabilities and Exposures, or CVEs, which packages they are found in, and the version we should upgrade to in order to get a patch. The reason this is not part of the Jenkins pipeline is that it takes quite a few minutes, like five to ten
minutes, to run, and we don't want to block the pipeline for so long. That's a long time for a developer to wait, 10 minutes, and maybe there is some other stuff running in the pipeline, so you would have to wait a long time to see the result. If there is one thing you will remember from this presentation, it's probably this one: if you are not scanning base Docker images, you should start doing so. And when you decide to upgrade packages, don't patch running containers; so don't SSH into the container and do the upgrade manually, because not only will it not scale, but the next time you spin up a new container,
the image will still have the old version, so we didn't actually solve the problem. Instead, rebuild the base image with the latest, or a desired, version. And lastly, we want to detect and prevent high-entropy strings from entering the code base. We create a baseline by assuming the existing code has no secrets and only checking the new code. To do so, we wrote our own Python tool, which is called detect-secrets; it's on our GitHub as well. It's loosely based on truffleHog, which is a secret scanner on GitHub, open source too. Right now, what we do for a service is try to detect if any secrets are about to be pushed to the
repositories, as a pre-commit hook. Doing so when the service is built, as part of the security check step, which I think is actually even easier to integrate, is a good way to double check that the developers didn't try to bypass the pre-commit hook requirements, because you can skip them very easily. One problem that the security check solves is creating tickets to track failures that need to be fixed. Once it has failed, an email will be sent to the owner team and a ticket, in our case a Jira ticket, will be created in their project via Sensu, which is also smart enough to not create
multiple tickets for consecutive failures, and to mark a ticket as done or solved when transitioning from a red to a green security check, so that's really neat. What I want to emphasize here is that when the security check fails, the engineer can check the logs. There is a runbook, as a short link, right in the output, so you don't have to remember anything; it's really convenient. Not only will you understand what the problem is and why it is a problem, but also what you need to do in order to fix it, rather than delegating this to the security team. So from our point of
view this is awesome. To recap: we discussed the shifting-to-the-left approach, why it is good and what the benefits are. Then we covered the security checks that are run every time a service is built, as part of the CI/CD pipeline; in our case that's Jenkins, but it could be the CI/CD of your choice. And very importantly, by shifting to the left and having such tests in place, developers and service owners are more aware of the security of their services and more involved in keeping them safe. I'm now happy to take any questions.
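Among the checks recapped above, the Docker best-practice ones (a non-root USER statement, no pinned Debian packages, and `.git` listed in `.dockerignore`) lend themselves to a short sketch. The version below is an assumption-laden illustration, not the implementation from the talk: it works on file contents passed in as strings, uses simple text matching rather than a real Dockerfile parser, and all function names and messages are hypothetical.

```python
import re
from typing import List

# Matches apt-style pins such as "curl=7.52.1-5" (lowercase package names only).
PINNED = re.compile(r"^[a-z0-9][a-z0-9+.-]*=\S+$")


def check_user(dockerfile: str) -> List[str]:
    """Fail if there is no USER instruction or the last one selects root."""
    users = [
        line.split(None, 1)[1].strip()
        for line in dockerfile.splitlines()
        if re.match(r"\s*USER\s+\S", line, re.IGNORECASE)
    ]
    if not users:
        return ["no USER statement: the container will run as root"]
    if users[-1].split(":")[0] in ("root", "0"):
        return ["last USER statement selects root"]
    return []


def check_pinned_packages(dockerfile: str) -> List[str]:
    """Flag package=version tokens on lines that run apt-get install."""
    joined = dockerfile.replace("\\\n", " ")  # fold line continuations
    findings: List[str] = []
    for line in joined.splitlines():
        if "apt-get" in line and "install" in line:
            findings.extend(
                f"pinned Debian package: {token}"
                for token in line.split()
                if PINNED.match(token)
            )
    return findings


def check_dockerignore(dockerignore: str) -> List[str]:
    """Require a literal .git entry (glob patterns are not interpreted here)."""
    entries = {line.strip().rstrip("/") for line in dockerignore.splitlines()}
    if ".git" in entries:
        return []
    return [".git is not listed in .dockerignore"]
```

Each function returns a list of messages so that the text can be written straight to the build log, which is where the talk says the pinning is mentioned.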
[Applause]
Thank you very much. How do you deal with false positives? Yes, these tests are pretty strong, so they're not flaky; it's not like one test fails at random, so we don't actually have that problem. Even with dependency scanning? Because even if you are using a library that has some CVEs, you are still not necessarily vulnerable to the problem related to the CVE. Yeah, right. So for the scan results, an engineer goes through the result and decides whether we should do something or just say, yeah, that's fine. OK, thank you.
So, when creating the tickets, how do you know who you should assign the ticket to, to work on it? Yeah, that's a very good question. So there's a configuration repository for these pipelines.
Actually no, scratch that. Each repository has an owners file, where we know the owner team and some engineers who have worked on it. So Sensu will check that file, and there's another mapping between the team and the Jira project, and it will go this way. Thanks.
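The two-step lookup described in this answer (owners file to team, then team to Jira project) might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the `team: <name>` line format, the team names, the project keys, and the `SECURITY` fallback project are all hypothetical, since the talk does not describe the file format or the mapping.

```python
from typing import Dict, Optional

# Hypothetical mapping from team name to Jira project key.
TEAM_TO_JIRA_PROJECT: Dict[str, str] = {
    "payments": "PAY",
    "search": "SRCH",
}


def parse_owner_team(owners_text: str) -> Optional[str]:
    """Return the value of the first 'team:' line, or None if there is none.

    The 'team: <name>' layout is an assumed format, not the real one.
    """
    for raw in owners_text.splitlines():
        key, sep, value = raw.partition(":")
        if sep and key.strip().lower() == "team" and value.strip():
            return value.strip()
    return None


def jira_project_for(
    owners_text: str,
    mapping: Dict[str, str],
    default: str = "SECURITY",  # assumed fallback when no owner is known
) -> str:
    """Resolve the Jira project for a repository from its owners file."""
    team = parse_owner_team(owners_text)
    if team is None:
        return default
    return mapping.get(team, default)
```

Falling back to a default project is a design choice of this sketch, so that a failure is never dropped just because a repository has no recognizable owner.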
I cannot hear you. So the ticketing is actually automated as well? Yeah, Sensu does that. What about timelines: do you have due dates for each severity? For example, a blocker should take a couple of days or a couple of hours to fix, and then there are major or minor ones. [unclear] Yeah, this is a very good question. So right now, even if there are tests that fail, the pipeline continues, but the tickets are created and the emails are sent. We are right now working on actually blocking, so that if some
more important tests are failing, the pipeline stops and you cannot continue unless you fix that. But we haven't decided yet which tests should be the ones that actually stop the pipeline; that is the plan, though. Have you checked other solutions than Clair that can be integrated and might be faster? Or maybe running it on a nightly basis and then having that information ready? Yeah, I'm aware that there are other options for container scanning. When we put this in place some time ago, Clair was one of the first things that we tried, and it works, and it's run by a cron job, so it would
be quite easy to switch from weekly to daily. But since it was in place and it was working fine, we didn't evaluate other options, though I am aware that they do exist. So what we do is we use Quay, which is the repository for the containers, and that performs a scan automatically when you push a new image there, and then we use the API to collect the vulnerabilities from Quay. That means that we don't need to actually be running Clair in the pipeline, but just consume the vulnerabilities from Quay, from the API. Oh, I will check it out, thank you.
Oh, there aren't any other questions. Thank you very much.