
Linux Sessions for Monitoring and Reviewing Linux Workloads

BSides Vancouver · 2021 · 14:46 · 73 views · Published 2021-06
About this talk
Mike Sample explains how Linux sessions are created across different contexts—SSH logins, container services, systemd, and tmux—and proposes a classification framework for monitoring them. The talk covers the Unix workload model, session fundamentals (user identity, timestamps, process relationships), and practical session categorizations to detect insider threats and application security risks.
Original YouTube description
BSides Vancouver 2021. Linux sessions are more than what you see in your terminal when you log in via SSH. Sessions are also created by terminal and console logins as well as services started in containers, services started by systemd, and tools like tmux. In this talk we examine the aspects of Linux sessions created in different ways as a high-level way to review both human-interactive and automated Linux workloads.
Transcript [en]

hello i'm mike sample the cto at cmd today i'm going to talk about using linux sessions to monitor and review linux workloads most of the world's servers run linux and virtually all of them are maintained using a command line interface by members of the dev devops and secops teams this presents an insider threat risk someone could be accessing or exfiltrating customer data in the production environment modern approaches to infrastructure prohibit direct access to production environments by the dev and devops teams instead they have indirect access by submitting source code and configuration changes to a source code repository these submissions are peer-reviewed for correctness and security however even with this improvement linux sessions need to be monitored in

production the applications used by customers run in linux sessions and secops may still need command-line access in this talk i'm going to present the fundamental aspects of measuring linux workloads define linux sessions and introduce some session classifications that are valuable for security and monitoring tasks the linux workload model comes directly from unix this is my copy of kernighan and pike which was published in 1984 the text snippet shows the same fork and exec system calls used to launch programs on linux today the setsid system call which creates a new linux session was introduced in 1989 recording a workload should capture who did what when and where
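
The fork, exec and setsid calls mentioned here are all exposed by Python's `os` module, so the launch sequence can be sketched in a few lines. This is an illustration added for this transcript, not code from the talk; the helper name `spawn_in_new_session` is hypothetical, and the sketch assumes a Linux or other POSIX host.

```python
import os


def spawn_in_new_session(argv):
    """Fork, make the child a session leader, then exec argv.

    Returns (child_pid, child_sid) as observed by the parent.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: setsid() creates a new session whose id equals our PID.
        os.close(read_fd)
        try:
            os.setsid()
            os.write(write_fd, str(os.getsid(0)).encode())
            os.close(write_fd)
            os.execvp(argv[0], argv)  # replaces this process image
        finally:
            os._exit(127)  # only reached if setsid or exec failed
    # Parent: read the session id reported by the child, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        child_sid = int(pipe.read())
    os.waitpid(pid, 0)
    return pid, child_sid
```

Calling `spawn_in_new_session(["true"])` should return a session id equal to the child's PID and different from the caller's own session id, which is the property that makes the child a session leader.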

when you log in with the command line to linux you are presented with a terminal view similar to what's shown on the right it echoes back the commands you type into it and it may show you the output of the programs those commands execute in this example i changed the directory to the source directory ran make -s and then exited the session the -s is for silent mode so make -s is really the tip of the iceberg for what could be going on it could simply be building some software packages but it could also be installing malware exfiltrating data and tainting your software supply chain now we'll examine the who what when and

where of linux workloads note that if you are recording these workloads in a central repository the server identity of where the actions took place must be captured also the linux process identifiers are reused and must be qualified with additional information to maintain their uniqueness in this central repository to the linux kernel user identity is represented by integer ids and even then the kernel only really cares if you have uid 0 or non-zero uid 0 is the root user or super user and has unlimited privileges names are associated with user and group ids by the /etc/passwd and /etc/group files as well as the pluggable name service switch which could for example access a central user directory through ldap
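
One way to qualify a reused PID so that it stays unique in a central repository is to combine it with the server identity, the kernel boot id and the process start time. The sketch below is an assumption about what such a key could look like, not the scheme used by the speaker; `ProcessKey` and its field names are hypothetical, while `/proc/sys/kernel/random/boot_id` is the standard Linux location of the boot id.

```python
from typing import NamedTuple


class ProcessKey(NamedTuple):
    """Hypothetical repository key; the choice of fields is an assumption."""
    host: str         # server identity, for example a hostname or machine id
    boot_id: str      # kernel boot id, changes on every reboot
    pid: int          # kernel PID, reused over time
    start_ticks: int  # process start time in clock ticks since boot


def read_boot_id(path: str = "/proc/sys/kernel/random/boot_id") -> str:
    """Return the current boot id as exposed by the Linux kernel."""
    with open(path) as f:
        return f.read().strip()
```

Two processes that happen to receive the same PID on the same host and boot still produce different keys, because their start times differ.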

there are two challenges today with the original traditional linux user model the first is that containerized workloads often run in user namespaces and use a different range of user ids when observed from outside that namespace the user ids do not match any entries in /etc/passwd the second challenge is that the traditional model of having unique linux username and group and credentials for each human user is being eroded by newer host access methods such as ssm and kubectl these tools use the same linux user for all accesses
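
The user namespace mismatch can be made concrete with the kernel's `uid_map` format, where each line of `/proc/<pid>/uid_map` holds three integers: the first id inside the namespace, the first id as seen from the reader's namespace, and the range length. The following is a minimal sketch that works on the text of such a file; the function names are hypothetical and the sample mapping in the usage note is invented.

```python
from typing import List, Optional, Tuple

UidMap = List[Tuple[int, int, int]]


def parse_uid_map(text: str) -> UidMap:
    """Parse uid_map text into (inside_start, outside_start, length) rows."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(x) for x in line.split())
            rows.append((inside, outside, length))
    return rows


def to_inside_uid(outside_uid: int, rows: UidMap) -> Optional[int]:
    """Translate a uid observed outside the namespace to the uid inside it."""
    for inside, outside, length in rows:
        if outside <= outside_uid < outside + length:
            return inside + (outside_uid - outside)
    return None  # the uid is not mapped into this namespace
```

With a mapping of `0 100000 65536`, a process observed from the host as uid 100033 is uid 33 inside the container, and host uid 1000 has no counterpart inside it.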

where schedulers like kubernetes use a pool of linux servers to run various containerized workloads the scheduler usually doesn't care which server is used for containerized workloads so long as it has the cpu ram and other necessary resources available this presents a disconnect between monitoring the server workload and being able to identify the individual applications the same server could be running 25 instances of nginx in different containers and the way you find out that it was the staging instance of your signup application is by recording additional metadata such as the container image identity and version and the kubernetes cluster namespace and pod names

when capturing the linux system clock in utc along with the kernel's boot id to assist in clock reset situations is a good start for capturing timestamps also capturing the event write or processing times at a centralized workload repository can assist in detecting large clock skew situations as with user ids the new linux time namespace makes it important to understand the ramifications of which time namespace the system clock is being observed from what processes files network and inter-process communication processes are the embodiment of running programs a linux server may execute millions of programs per day these processes may touch files use the network communicate with other processes on the server and may change the configuration of the

system and hardware by capturing an accurate version of these events one may see the possible paths of data flow in the system for example one could capture that a process had the /etc/shadow file open for reading at the same time as it had a connection to a remote ip address open and therefore the contents of /etc/shadow may have flowed to that ip address
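
The data flow reasoning in this example amounts to an interval overlap check between the time a file was open for reading and the time a connection was open. Here is a small sketch of that check, assuming events have already been reduced to open and close timestamps for a single process; `OpenInterval` and `possible_flows` are hypothetical names, and an overlap only indicates a possible flow, not a proven one.

```python
from typing import List, NamedTuple, Optional, Tuple


class OpenInterval(NamedTuple):
    resource: str            # file path or remote address
    opened: float            # timestamp when the resource was opened
    closed: Optional[float]  # None means it is still open


def overlaps(a: OpenInterval, b: OpenInterval) -> bool:
    """True when both resources were open at the same moment."""
    a_end = float("inf") if a.closed is None else a.closed
    b_end = float("inf") if b.closed is None else b.closed
    return a.opened < b_end and b.opened < a_end


def possible_flows(file_reads: List[OpenInterval],
                   connections: List[OpenInterval]) -> List[Tuple[str, str]]:
    """Pair every file read with every connection open at the same time."""
    return [(f.resource, c.resource)
            for f in file_reads for c in connections if overlaps(f, c)]
```

A read of `/etc/shadow` from t=10 to t=12 paired with a connection opened at t=11 is reported, while a connection opened at t=12.5 is not.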

i lied a bit when i said terminal output cannot accurately capture a linux workload you can capture some of it if you run the ps command in your terminal to show all the running processes the venerable ps command was documented in kernighan and pike in 1984 as shown in the text snippet the terminal output on the right shows all the running processes on a very lightly loaded unix linux server or sorry ubuntu linux server even on a lightly loaded server there are a lot of processes and much information to be reviewed

ps ajxf piped into awk is one of my favorite commands awk is also documented in kernighan and pike so i guess i like vintage programs here we use awk to reduce the noise by filtering out kernel threads this leaves the linux services started by the init process and my ssh session if you look a bit more closely the purple highlights show four related sessions starting at the bottom with my ssh session its ssh connection processing parent session above it the main ssh server listening for new connections above that and finally the init process's session we'll look at this in a bit more detail in the next slide this is the ps output from the previous slide with some sessions

removed so that only those related to my ssh login are shown starting at the top highlighted session we can see the init process running /sbin/init which is a symlink to systemd this process is started by the kernel when it boots and its job is to start up all the configured services one of those services is sshd which listens on tcp port 22 for incoming ssh sessions sshd is the second session down and we can see that its parent pid is pid 1 that of the init process therefore its parent session is init's session sid 1 the third session down is comprised of two processes created by

sshd to handle my specific ssh connection and finally the bottom session is my login shell that was created by its parent session it has three processes in it two of which are for the ps and awk programs note the process group id or pgid of ps and awk is the same this is important for signaling and job control special sessions there are some special sessions the init session as discussed on the previous slide is started when the system boots it is the ancestor of every other session on the system host access sessions permit command line access to the server over the network and by directly attached terminals host access sessions can be identified by their executable name such as sshd
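
The same session and process group ids that ps prints are available in `/proc/<pid>/stat`, whose fields after the command name are state, ppid, pgrp, session and tty_nr. The sketch below groups processes by session id from such lines; it is not the awk filter shown in the talk, the names `Proc`, `parse_stat` and `group_by_session` are hypothetical, and treating PID 2 and its children as kernel threads is an assumption based on the usual kthreadd convention.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Proc(NamedTuple):
    pid: int
    comm: str
    ppid: int
    pgid: int
    sid: int
    tty_nr: int


def parse_stat(line: str) -> Proc:
    """Parse the leading fields of one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition(" (")
    fields = tail.split()  # state, ppid, pgrp, session, tty_nr, ...
    return Proc(int(pid_text), comm, int(fields[1]),
                int(fields[2]), int(fields[3]), int(fields[4]))


def group_by_session(procs: Iterable[Proc],
                     skip_kernel_threads: bool = True) -> Dict[int, List[Proc]]:
    """Group processes by session id, optionally dropping kernel threads."""
    sessions = defaultdict(list)
    for p in procs:
        # Assumption: kernel threads are kthreadd (PID 2) and its children.
        if skip_kernel_threads and (p.pid == 2 or p.ppid == 2):
            continue
        sessions[p.sid].append(p)
    return dict(sessions)
```

On a live system the input lines could be read from every numeric directory under `/proc`; the grouping then mirrors the per-session blocks visible in the forest view of ps.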

individual host access sessions such as an ssh login appear as child sessions of the host access session agetty which handles terminal and console logins is an exception to this and reuses agetty's session for the login shell when this user logs out the linux session is ended and a new one is created for a new agetty

special sessions tmux and screen tmux and screen are special programs that protect command line sessions from being automatically ended if you log out or a network issue breaks your connection to the server devops folks will often use these to protect complex administration and troubleshooting tasks and to allow these sessions to be accessed by other users or later in the week tmux and screen sessions will re-parent their sessions to be children of the init session to accommodate the loss of the host access session instance in which they were first created special session features a controlling tty may be associated with a linux session its presence suggests that the associated session's purpose is for interactive command line use by a human
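
Whether a process has a controlling tty can be read from the `tty_nr` field of `/proc/<pid>/stat`: zero means none, and otherwise the value packs the device numbers. The sketch below follows the bit layout described in the proc(5) manual page (major in bits 15 to 8, minor in bits 31 to 20 and 7 to 0); the function name `decode_tty_nr` is hypothetical.

```python
from typing import Optional, Tuple


def decode_tty_nr(tty_nr: int) -> Optional[Tuple[int, int]]:
    """Return (major, minor) of the controlling tty, or None if there is none."""
    if tty_nr == 0:
        return None  # no controlling terminal
    major = (tty_nr >> 8) & 0xFF
    # Low 8 bits of the minor sit in bits 7..0, the rest in bits 31..20.
    minor = (tty_nr & 0xFF) | ((tty_nr >> 20) << 8)
    return major, minor
```

For example, 34816 decodes to major 136, minor 0, which on Linux corresponds to the pseudo-terminal `/dev/pts/0`, while 1025 decodes to major 4, minor 1, the virtual console `/dev/tty1`.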

finally if we take into account the special sessions like init host access tmux and screen as well as the presence of a controlling tty we can divide sessions into some useful classifications first host access enabling sessions identified by their executable name should be carefully enumerated for security purposes these are the doors into your linux house note that even if these sessions match aspects of the four quadrants we consider host access sessions to be unique and not part of those quadrants host access sessions include sshd amazon ssm agent and the kubelet plus cri-o as well as agetty in the case of agetty it will transition from a host access session to the lower left quadrant when it execs

login second the upper left quadrant representing instances of host access including an associated controlling tty should be carefully scrutinized for insider threat purposes there is typically a human associated with them the upper right quadrant represents some non-interactive remote access such as using ssh to run a command instead of creating a login shell often this type of session is used by automation tools such as ansible but humans may use it as well ideally separate linux usernames are used to discriminate between automation and humans to permit more careful review of human access the bottom left quadrant generally represents shells running inside tmux screen and from terminal access any other matches should be examined as well the bottom right quadrant represents

services started by init or those that have re-parented themselves to the init session like the non-interactive main tmux and screen processes do the services portion is important to monitor as this is typically how applications such as web servers run and where application security exploits may provide a means for attackers to gain access to the system these are the windows to your linux house these classifications help tailor the type of monitoring and indicators of compromise to watch for they can also help in selecting which workloads to record depending on your threat model today we covered a model for how one can accurately describe linux workloads using users groups timestamps host names and container metadata processes file access network access and

inter-process communication we defined the linux session as a group of related processes initiated by the setsid system call by the session leader process and finally we classified linux sessions to focus review and monitoring into the following services that can be web applications with application security concerns human access for interactive command line use including tmux and screen shells non-interactive host access by humans and automation and the sessions that provide host access itself which should be enabled according to policy thank you for following along i hope this has given you a clear understanding of linux workloads and sessions
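
The classification summarized above can be expressed as a small decision function over three inputs: the session leader's executable name, whether the session has a controlling tty, and whether its parent session is a host access session. This is a sketch of one reading of the quadrants, not the speaker's implementation; the function name, the category labels and the exact executable names in `HOST_ACCESS_EXECUTABLES` are assumptions.

```python
# Assumed executable names for the host access programs named in the talk.
HOST_ACCESS_EXECUTABLES = {"sshd", "amazon-ssm-agent", "kubelet", "crio", "agetty"}


def classify_session(leader_exe: str, has_ctty: bool,
                     parent_is_host_access: bool) -> str:
    """Place a session into one of the five groups described in the talk."""
    # Host access enabling sessions are identified by executable name and
    # are kept apart from the four quadrants.
    if leader_exe in HOST_ACCESS_EXECUTABLES:
        return "host access enabling"
    if parent_is_host_access:
        # Upper quadrants: instances of host access.
        return ("interactive host access" if has_ctty
                else "non-interactive host access")
    # Lower quadrants: sessions parented by init.
    return ("interactive shell (tmux, screen, terminal)" if has_ctty
            else "service")
```

Under this reading, an agetty session moves from the host access group to the lower left quadrant once its leader execs `login`, because the executable name no longer matches while the controlling tty remains.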