BSidesSF 2020 - Mapping the Connections Inside Russia’s APT Ecosystem (Ari Eitan)

BSidesSF · 2020 · 16:38 · 204 views · Published 2020-03
About this talk
Ari Eitan - The Red Square: Mapping the Connections Inside Russia’s APT Ecosystem. This talk will detail the stages involved in the research study analyzing the Russian APT ecosystem. It will present two open-source tools which can be used by the infosec community to further investigate Russian-related cyber attacks.
Transcript [en]

Thank you very much for attending this session. This talk is titled "The Red Square: Mapping the Connections Inside Russia's APT Ecosystem". Let me give you a short introduction. In the last 15 years, many Russian operations and malware families were publicly exposed by different security vendors and intelligence organizations. Those publications focused on a specific Russian actor or operation, but the bigger picture remained unclear: the Russian APT ecosystem as a whole. We wanted to get a better understanding of what this ecosystem looks like in terms of the connections between its different components, and that is why we conducted this research. I'm really happy to be here today to show the results, and

I'll also present several open-source tools that we built during our research and that you, the community, can now use for further investigation. But before we start, I would like to introduce myself. My name is Ari Eitan and I'm the VP of Research at Intezer. I usually present this talk with Itay Cohen, a security researcher from Check Point. This research is the result of a collaboration between the Check Point Research group and Intezer's research team, led by Itay and by Omri Ben Bassat from our side, and in total we worked on it for a few months. So let's start. As I said, we wanted to detect connections between different Russian entities,

such as a pair of malicious samples, whole families, and even Russian actors themselves. But what is a connection? A connection can be a shared module, a tool, or even a specific implementation of a function. Our research was very wide, and we weren't limited to one type of connection. We had many questions in mind that we wanted to answer. For example, are the different Russian government entities working alone, or are they sharing code and techniques with each other? If so, which artifacts, libraries, and code are more likely to be shared between two different actors or two different tools? Can we support a known connection between two different actors, but from a technical perspective and not from the intelligence

perspective? And so on and so forth; we really had many questions in mind. All those questions led us to start reading public information shared by other vendors, researchers, and even governments regarding Russian attacks. There is a lot of information available out there, so that was the first step we took, and it was a very important one. I would like to thank all the amazing researchers who have published on Russian attacks in the past years. After days of reading background materials and publications, it was clear to us how we should proceed towards this goal of mapping the Russian ecosystem. To put it simply, we split the research into four steps. The first

step is to collect samples that we know were attributed to Russia. Then we need to classify those samples, which may sound easy but was a tricky step, and I'll elaborate on that. Next, we want to find the code similarities, the connections between the different samples. Lastly, we analyze the interesting and relevant connections. So let's start. As I said, the first stage was to collect samples. We extracted IOCs from all the reports we read, and then we had to grab the actual samples, the binaries, from many sources. Overall, we had approximately 2,000 unique samples to work with. The next step was to classify those samples, and although it may sound easy, since the name of

the attack appears in the report we grabbed the IOCs from, classifying the samples turned out to be one of the most complicated parts of this research, and I'll explain why. First of all, there is no naming convention for malware and threat actors in the infosec industry. Every malware family and every actor has more than one name, given to them by different vendors. Some vendors use different names to describe the same family, and other malware families simply do not have a clear name. Those issues and more made us face one of the most painful drawbacks of classification and required us to be very careful when we classify a specific piece of malware

to a family or an actor. But naming conventions were not the only problem. Sometimes we discovered problems with the IOCs: they were too generic or even wrong. Since we needed to know exactly what every sample does, we had to dive in deeper. So we did, and we built this template of the pieces of information that we wanted to collect for every sample of Russian malware that we had. First of all, actor: which actor is known or likely to have written this malware? For example Turla, GreyEnergy, Sofacy, and so on. Next, family: what is the common family name associated with this malware? Then, module: many malware files

are built in a modular way, in which the malware can load a specific module embedded in it or download the module from a C2 server, and we wanted to know whether the sample we have is a keylogger module, a communication module, an injection module, or anything else. Lastly, version: some malware files have a clear version stamp embedded in them, and when possible we wanted to be able to differentiate between earlier and more recent versions. At this point we had classified all our samples into 60 families and 200 different modules, and we were ready to move on to the third part, which is finding the code similarities. For that we used Intezer's

genetic malware analysis technology, which basically means that for every malware sample we collected, we automatically disassembled and dissected the binary into thousands of small pieces of assembly, which we refer to as genes. Then we searched for the other malware samples, Russian samples, in which we had seen those genes in the past, using this genome database. This database contains genes, the assembly fragments, from malware files, including Russian files, but also from legitimate software, which helps us focus only on the unique and malicious code that was shared between the different Russian samples without wasting time on shared library code like OpenSSL, for example. Eventually we were able to connect different Russian

samples based on their pure assembly code. Every connection means that we've seen those genes only in those two Russian samples and not in any other malware which is not Russian, or in trusted software, so connecting the samples on that basis is really powerful. It was actually a bit more complex than that: some of the files were packed, probably in order to stay evasive, so we had to unpack them automatically, statically or dynamically. The output of this process was a list of pairs of Russian samples that share code with each other. The next step was to turn this list into a

graph. Now we had a decent number of connections to analyze, and we came upon this open-source software called Gephi. How many of you are familiar with Gephi? Yeah, okay. Gephi is a platform to visualize and analyze graphs. We downloaded it, loaded our data, and got this as a result, which is obviously not practical; there's nothing we can do with it. So we applied some graph layout algorithms to make the graph usable, as you can see here, and that was the result. We could start spotting some clusters and even connections, which was nice, but we were still missing the context. So we added labels and colors to the graph, and

now we were ready to start analyzing the clusters and connections. The colors here represent the different Russian actors, so it's very easy to find the connections visually. There are three types of connections on this graph. The first one, the most obvious for us, is when two samples are of the same actor and the same malware family. The second type is when two samples are of the same actor but of a different family. The third type, which is obviously the most interesting one for us, is when two samples share code and belong to a different family and a different actor. Those are the types of connections that

we're looking for. Before I show the results of the analysis we did, let me introduce our first open-source tool that I'll present today. We took this huge map from Gephi and turned it into a web-based map that you can interact with to conduct your own analysis. Let's have a quick demo. First of all, it's live at apt-ecosystem.com, so you can try it, but I already prepared it, and this is how it looks. In this graph every node is a malware family and not a sample, which is why you see fewer nodes than in the previous graph, but you can zoom in and start spotting the connections.
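As a rough illustration of the classification template and the three connection types described above, the Python sketch below labels each pair of code-sharing samples and then collapses the sample-level pairs into family-level edges, similar in spirit to the web map where every node is a family. The `Sample` fields follow the template from the talk; the function names, the labels, and all sample data are hypothetical placeholders and are not taken from the published tools.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Sample:
    """One classified sample: actor, family, module, version (as in the talk)."""
    actor: str
    family: str
    module: Optional[str] = None
    version: Optional[str] = None


def connection_type(a: Sample, b: Sample) -> str:
    """Label a code-sharing pair with one of the three connection types."""
    if a.actor != b.actor:
        # Assumption: any pair with differing actors is treated as cross-actor.
        return "cross_actor"
    if a.family != b.family:
        return "cross_family"
    return "same_family"


def family_edges(samples: Dict[str, Sample],
                 pairs: Iterable[Tuple[str, str]]) -> Counter:
    """Collapse sample pairs into family-level edges, counting pairs per edge."""
    edges: Counter = Counter()
    for x, y in pairs:
        fa, fb = samples[x].family, samples[y].family
        if fa != fb:  # pairs inside one family do not create an edge
            edges[tuple(sorted((fa, fb)))] += 1
    return edges


# Hypothetical data: sample IDs, actors, and families are placeholders.
samples = {
    "s1": Sample("ActorX", "FamilyA", module="keylogger"),
    "s2": Sample("ActorX", "FamilyA", module="communication"),
    "s3": Sample("ActorX", "FamilyB"),
    "s4": Sample("ActorY", "FamilyC"),
}
pairs = [("s1", "s2"), ("s1", "s3"), ("s2", "s3"), ("s3", "s4")]

for x, y in pairs:
    print(x, y, connection_type(samples[x], samples[y]))
print(dict(family_edges(samples, pairs)))
```

Counting the sample pairs behind each family edge is one possible way to weight edges; the talk does not say how the published map weights them.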

For this node, for example, I can immediately see its name, the actor, and the connections it has to other samples. Once I click on it, I get the sidebar with some more information regarding this specific malware family. I can see some synonyms, because remember, there is no standard naming convention in the industry; a short description of the malware family; some references that we recommend reading; and of course links to the platform, where you can conduct your own analysis and see the genetic report of the specific sample and its connections. The map and the raw data are both open source, so you can use it, you

can conduct your own analysis, and they're all available on our GitHub, so please try it and give us your feedback. That's it for this map. So now we had this huge map of connections and we were ready to start analyzing them, and for us as reverse engineers, this is where the fun part begins. We analyzed many connections, and we really hoped to find some groundbreaking findings, some new and undetected connections that nobody had ever mentioned before. Unfortunately, and to our disappointment, we couldn't find any cross-actor connection: we couldn't find that two different actors operating under the same Russian umbrella are sharing code with each other.

We did find many indications that two different families operating under the same actor are sharing code, but not between the different actors. At first we were quite disappointed by this result, because we really wanted to find some new and undetected connection. But then we realized that the fact that no code was shared between the different Russian actors is an interesting conclusion by itself. Then we started to ask ourselves why the different Russian actors are not sharing code with each other. While we could come up with some theories, the truth is that we really don't know, and of course we don't have anyone to ask. But we do have some

theories. The first theory is that Russia is well aware of the importance of operational security, OPSEC. Let's go through the pros and cons of the decision not to share code in terms of OPSEC. Russia knows that if code is shared between two different actors and one actor gets caught, it puts the other operations at risk. This is obviously something they would like to avoid, and by separating the code bases they can make sure it won't happen. But of course it comes with a cost, and a very expensive one, because if this theory is true, it means that Russia is willing to invest time, effort, and man-hours in having different teams of

malware developers writing the exact same functionality over and over again instead of just sharing code between the different tools. The second theory is completely different: in this theory, Russian organizations do not share code due to internal politics. This could be the case, but again, we don't know that for sure, and I just wanted to put it on the table as an option.

So unfortunately we couldn't find any unknown, undocumented connection, but we had still analyzed many samples and many connections, and we wanted to do something with that, so we wrote another tool. Having access to thousands of samples, we were able to tell which genes, the assembly fragments, are the most popular per family. So we wrote the APT Detector. Basically, we grabbed the most popular genes per family and automatically generated YARA rules based on those genes. We merged the YARA rules from all the different Russian families into one YARA rule set, wrapped it with Python into an executable, and now

it's ready to use. You can use it to scan your systems and make sure that you're not infected. This tool, by the way, is cross-platform, so you can execute it on Linux, Mac, or Windows. Once again, both the YARA rules and the tool itself are open source and available on our GitHub, so you can take the YARA rules, for example, and deploy them in any other system you may have, or conduct your own analysis using those rules. So this is the second tool, and that's basically it. Thank you very much. I'll be around if you have any further questions, and I hope you enjoyed it. [Applause]
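The rule-generation step described in the talk (pick the most popular genes per family, then emit YARA rules from them) might look roughly like the Python sketch below. It is not the APT Detector's actual code. It assumes a gene can be represented as a raw byte sequence, whereas real genes are assembly fragments that would likely need normalization or wildcards. The function names, the `APT_` rule prefix, and the example bytes are all hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Set


def top_genes(family_samples: Iterable[Set[bytes]], n: int) -> List[bytes]:
    """Return the n genes that occur in the most samples of one family."""
    counts = Counter(g for genes in family_samples for g in genes)
    # Sort by descending count, then by bytes so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gene for gene, _ in ranked[:n]]


def yara_rule(family: str, genes: List[bytes], min_hits: int = 1) -> str:
    """Render one YARA rule that matches when min_hits genes are present."""
    if not genes:
        raise ValueError("at least one gene is required")
    min_hits = max(1, min(min_hits, len(genes)))
    # YARA identifiers allow only ASCII letters, digits, and underscores.
    name = "APT_" + re.sub(r"[^A-Za-z0-9_]", "_", family)
    lines = [f"rule {name}", "{", "    strings:"]
    for i, gene in enumerate(genes):
        hex_bytes = " ".join(f"{b:02X}" for b in gene)
        lines.append(f"        $gene{i} = {{ {hex_bytes} }}")
    lines += ["    condition:", f"        {min_hits} of them", "}"]
    return "\n".join(lines)


# Hypothetical family with two samples; the byte strings are placeholders.
family_a = [{b"\x55\x8b\xec", b"\x90\x90"}, {b"\x55\x8b\xec"}]
rule_set = "\n\n".join([yara_rule("Family A", top_genes(family_a, 1))])
print(rule_set)
```

Joining the per-family rules into a single string mirrors the merged rule set mentioned in the talk; the resulting text could then be saved as a `.yar` file and loaded by any YARA-compatible scanner.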