
Luckily, it's time for the shameless plug: we just released a malware score for macros into VirusTotal. I feel comfortable doing a shameless plug because this has been the last six months of my life; I was the primary developer on it. It's a machine-learning-based maliciousness detector that looks at the macro text of a document, and Speed Grapher's original purpose was actually finding true positives for it, so we could improve that score. Go check it out when you get a chance. We're going to use it as the ground truth for labels.

Next we're going to build a dumb classifier. I call it a dumb classifier because in our operation we haven't really built a classifier for this; I built this for the talk. We're still refining the information we can get out of Speed Grapher, extending it to different file types, all that sort of stuff, so we haven't really gotten to the classifier stage yet. But I want to show you that when you have rich data, really good data, and decent labels, you can build something very robust very easily.

What we're going to do is take all the data you saw in that JSON blob and just smash it into a vector. We're not doing any analysis on it; we just put the raw numbers in there. We take the color centroids and put those in. We take the blur and blank vectors and put those in. For the YOLO keys, we rip out the percentage and set it at the appropriate index. Then we grab some text keys and drop them in, one-hot encoded: if a key is present, that value is one; if not, it's zero. So if the text contained just "enabled content", you'd get the vector [1, 0, 0], because it had neither "enable macros" nor "macros". If it had "enable macros", you'd get [0, 1, 1]: "enable macros" is there, and so is "macros", because "enable macros" contains it.
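That one-hot text-key encoding can be sketched in a few lines (the key names here are illustrative stand-ins, not necessarily the tool's exact feature list):

```python
# Sketch of the one-hot text-key encoding described above. The key phrases
# below are assumptions for illustration, not the project's real feature list.
TEXT_KEYS = ["enabled content", "enable macros", "macros"]

def one_hot_text_keys(ocr_text: str) -> list[int]:
    """Return a 0/1 vector: 1 if the key phrase appears in the OCR'd text."""
    text = ocr_text.lower()
    return [1 if key in text else 0 for key in TEXT_KEYS]

print(one_hot_text_keys("Please click Enabled Content to view"))  # [1, 0, 0]
print(one_hot_text_keys("You must Enable Macros first"))          # [0, 1, 1]
```

Note how the second example sets both the "enable macros" and "macros" slots, exactly as described: the longer phrase contains the shorter one.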
So we just concatenate all of that into one vector. One thing you do want to care about, though: later we're going to use a random forest classifier, so this doesn't strictly matter (it's a quirk of that classifier, which draws tree-based decision boundaries), but good data hygiene in general is to normalize your vector between zero and one. You want to make sure that a feature with a large scale, say zero to 9,000, doesn't outweigh a feature scaled zero to five just because it has big numbers; big numbers don't imply big importance. So we normalize real quick.
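Min-max scaling is the usual way to do that; a minimal sketch:

```python
import numpy as np

# Min-max scaling: squeeze every feature column into [0, 1] so a feature
# that ranges 0-9000 can't drown out one that ranges 0-5.
def min_max_normalize(X: np.ndarray) -> np.ndarray:
    mins = X.min(axis=0)
    ranges = X.max(axis=0) - mins
    ranges[ranges == 0] = 1.0  # avoid divide-by-zero for constant features
    return (X - mins) / ranges

# Toy feature matrix: column 0 has a huge scale, column 1 a tiny one.
X = np.array([[9000.0, 1.0],
              [4500.0, 5.0],
              [   0.0, 3.0]])
print(min_max_normalize(X))
```

After scaling, both columns live on the same [0, 1] range, so neither dominates a distance or split criterion purely by magnitude.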
Then we convert everything to NumPy arrays; if you're doing data science and not using NumPy, you're probably doing something wrong. Next we build a classifier, and for this we just make a basic random forest classifier. I copy-pasted this from Sublime Text, so it's more complicated than it needs to be; it could be one line. You import RandomForestClassifier from sklearn.ensemble, set the number of estimators and the max depth, and you pretty much have a classifier you can train. And then, finally, we want to train it.
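For reference, the one-liner version really is about that short (a sketch; these hyperparameter values and the tiny training set are illustrative, not the talk's):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# A basic random forest: a bag of decision trees voting together.
clf = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=0)

# Tiny stand-in training set: feature vectors with 0 (benign) / 1 (malicious) labels.
X_train = np.array([[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]])
y_train = np.array([0, 0, 1, 1])
clf.fit(X_train, y_train)

print(clf.predict([[0.95, 0.95]]))  # a vector near the "malicious" cluster
```

That's the whole classifier; everything interesting is in the features you feed it.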
One rule of training that should be hammered home constantly: you don't test on the data you train with. That would be like getting ready for Vegas by practicing blackjack where you look at your cards, peek at the dealer's cards, and then make your bet. That might work at home; you'll do great at home. You come here, you get kicked out of the casino. You want to train on one set of data and test on another, and you want to do it at scale, so we use what's called k-fold cross-validation: train on a portion of the data, test on another portion, and repeat with a different train/test split. Say we do 10-fold cross-validation: we run this ten times, each time taking a test set of ten percent of the data, training on the other ninety percent, evaluating on that ten percent, and recording all the predictions we made along with the true labels. Then we repeat, moving the window so we test on the next ten percent and train on the remaining ninety percent. We keep going, aggregate all the test predictions, and from them we can build a confusion matrix, compute false positive and true positive rates, all that sort of stuff, and draw the same ROC curve we did before.

Let's take a look. Again, you can plot it in one line; you might even miss it, because we're that good. We have an area under the receiver operating characteristic curve of 0.98, which is really good for a dumb classifier; that's a testament to good data going in. We can look at the false positive rate and the true positive rate.
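The whole 10-fold loop, from out-of-fold predictions to the aggregate ROC AUC and confusion matrix, can be sketched like this (with synthetic stand-in data, since the document corpus obviously isn't included):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Stand-in data: 500 synthetic "documents" with 20 features each.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
oof_scores = np.zeros(len(y))  # out-of-fold prediction for every sample

# Train on 90%, score the held-out 10%, slide the window, repeat ten times.
for train_idx, test_idx in StratifiedKFold(n_splits=10, shuffle=True,
                                           random_state=0).split(X, y):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    oof_scores[test_idx] = clf.predict_proba(X[test_idx])[:, 1]  # P(malicious)

# Every sample was scored by a model that never saw it during training.
print("ROC AUC:", roc_auc_score(y, oof_scores))
print(confusion_matrix(y, oof_scores > 0.5))
```

Because every prediction is made on held-out data, the aggregate AUC is an honest estimate rather than a training-set score.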
We have a bit of class imbalance between malicious and benign in our training set. That's sort of okay; this one isn't too bad. If you have a hundred benign samples and 20,000 malicious samples, you've got a problem. This one I'd want to adjust, but for our purposes it's fine. Then we can look at our false positives, samples we labeled malicious that our labeling system considers benign, and our false negatives, samples we labeled benign that the labeling system considers malicious. It's always good practice to dive into those: look at what you missed, then tweak how you perform, change your algorithm, change your data, change your labels, retrain, and minimize those two quantities.

One thing that was very interesting: 70 percent of the false positives are actually neutered malware. Here's one of those samples. It looks clean enough, but it's fishy: it's asking you to enable content to view a "decrypted message". When you look at it in VirusTotal, where you can see the actual macro text (or rip it out yourself if you have a parser), you'll see that the macro code was removed by Symantec.
So, thanks, Symantec; that's great and all, but you messed up my classifier. Also, the majority of the false negatives are malicious but don't really show it. We're doing a computer vision application, and some things just aren't visible to computer vision; this has to be part of a layered mechanism, because you can't depend on one thing, which is a tenet that holds all the time. Some of these samples contain characters our OCR has trouble with. We didn't do much configuration on the OCR (we can revisit that when we decide to), but a lot of Asian characters just don't show up for us; I think we were using a Latin character set or something. And some samples don't show anything bad on the first page. We have this nice resume from Joe Smith (Joe, if you're here, come talk to me after the talk), a security specialist, but if you dig into the macro, it doesn't look like it's doing anything good. It looks like a normal resume, so our computer vision application isn't going to detect it, and I wouldn't want it to, because it doesn't look bad; we're trying to find stuff that looks bad. Luckily, our other malware score caught this one, so we're all good.

And that's basically it. We're at about 40 minutes. We covered the background, the phishing problem, current strategies, and we talked about Blazar and Speed Grapher. Here's a set of links if you want to take a picture. If you have any questions, I think we've got a microphone we can pass around.
[Moderator] Thanks a lot. Are there any questions?

[Question] Your ROC score was phenomenal, 0.98; it almost makes me a little skeptical. What did you do to defend against overfitting?

[Answer] Yes, overfitting is definitely a concern here. We're using cross-validation, which is a decent measurement technique against overfitting. But I will say there's an intrinsic problem with this data: there are repeats throughout, so we might be training on something that looks super similar to something we're testing on. That's a huge caveat, and it's a very good point. When we build a real classifier, we want to sanitize and deduplicate all of this. That being said, a ROC AUC over 0.98 in this realm is not unheard of; our macro detector is over 0.99. It's almost at four nines, and we're trying to get that fourth nine. Any more questions?
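An exact-duplicate pass is the easy first step of that deduplication and can be sketched like this; catching the "looks super similar" cases would need fuzzy hashing (e.g. ssdeep) or feature-space clustering on top of it:

```python
import hashlib

# Exact deduplication sketch: hash each sample's raw bytes and keep one copy,
# so the identical file can't end up in both a train fold and a test fold.
def dedupe(samples: list[bytes]) -> list[bytes]:
    seen, unique = set(), []
    for blob in samples:
        digest = hashlib.sha256(blob).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(blob)
    return unique

docs = [b"doc-A", b"doc-B", b"doc-A"]
print(len(dedupe(docs)))  # 2
```

Running this before the cross-validation split removes the cheapest source of train/test leakage.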
[Question] On the Blazar project, did you try different font types? I can see how different fonts would change the look of the URL.

[Answer] That is a very good question. We didn't; that's extended research for if we really develop this as part of the product. We used Arial as the base font, and with the repo we ask you to download and install Arial, because that's what it works on. Using a different font should be trivial: just add the font and duplicate your data set with it. You might lose some accuracy, but it should work the exact same way. This was a research project, though; we haven't put it in the product yet, so we used one font.

[Question] For Blazar, say you're trying to visit goggle.com, and you actually want to go to goggle.com. Is there any way to make sure it won't flag legitimate but similar-looking websites, like Google?
[Answer] That's a very good problem. One way to handle it is to expand your training data set to include things like that, but I'll say it right here: you're never going to get this method to work 100 percent. We also used this for file name detection. If you're running a bunch of processes and you see svchost with the o replaced by a zero, that's malicious; but then you have things like java and javac, which look like they could be spoofs of each other, yet both are legitimate processes. With short character strings like that, you do get a lower rate. You'll notice (let me go back all the way here, one second) that our ROC curve sort of bottoms out at one point, and that's exactly this problem. You can see at the top we're getting pretty close, but it does bottom out a little. When we do the one for file names it's even more pronounced; this one's a little hard to see, but there are file names that are legitimate yet look like spoofs of each other. So that might be a whitelisting feature you add: okay, you can go to goggle.com, that's fine. Any more questions? Well, that's great; thanks a lot, Daniel.

[Applause]
So thanks for coming. This talk is about detecting social engineering attacks using natural language processing techniques: we process sentences as they come in and see if they're suspicious. That's basically what we're going to do, but first I want to acknowledge my co-authors. Toward the end of this list is me, and then Marcel, who's standing right here; but the first people there, who did all the coding, are my students from last year. They need the credit, because they coded basically the bulk of this, although they're not here giving the talk.

Some introductions. This is Ian: he's a professor of computer science at the University of California, Irvine, and he does research in hardware verification and security and a lot of natural language processing. And this is Marcel Carlsson: he's a principal consultant at Lootcore, and he does red teaming, consulting, and security research in general. He likes hardware hacking, and social engineering is his favorite thing.
Quickly, a little primer on social engineering. I'm sure many of you know this, but we'll run through it. We have a definition here that's pretty good: social engineering is any act that influences a person to take an action that may or may not be in their best interest. Remember, not all social engineering is evil; a doctor might social-engineer you into doing something that's good for you, even if you don't realize it. And social engineering is nothing new; it's been around for a very long time. You have con men, con women, con artists, and your kids are very good social engineers: they want the ice cream and they know how to get it. It's a fascinating but complex topic. We won't have time for all the details, but it involves things like body language, micro-expressions, framing, pretexting, elicitation, and so on; if you have questions, we can talk afterwards.

Let's talk about organizations and companies, because they're really exposed to social engineering. A modern business is a complex ecosystem of technology, processes, and people working together, and I'd say the social engineering threat is usually underestimated by most companies; awareness is usually quite low. It's gotten somewhat better over the last few years, and some companies do pen testing with spear phishing and the like, but normally it's pretty bad. If you run any sort of spear phishing or social engineering attack, it will usually work, and that's why it's still used in most attacks today. Another contributing factor: in a company's solutions and business processes, the burden of making security decisions often falls on the individual user, and that, of course, gets exploited by social engineers.

That leads us to trust relationships. A typical organization engages service providers and trusted business partners, and often someone else runs physical security, facility management, and so on. All of this exposes trust relationships that a social engineer will target and exploit. To identify them, social engineers use open-source intelligence gathering. Most companies and their employees post all kinds of information on social media, so you can harvest it all, including the metadata, and automate the process to harvest a lot quickly. There are also data dumps and leaks out there with passwords and usernames, so you can find the username format and the like for your phishing, plus technology infrastructure information such as IP addresses. We gather all of that to build profiles of companies and people for our attacks.
There are some pretty cool new tools coming out; if you look at the topics at this conference and at Black Hat, you'll see a lot of machine learning and cognitive science. This is something called Microsoft Video Indexer, and it's free: you sign up and upload videos. Here's the fake Obama video from BuzzFeed; you upload it, and it does some pretty cool analysis. It extracts metadata, does facial recognition to find all the people in the video and place them on a timeline, transcribes the spoken words, extracts keywords, and, as you can see here, even translates between languages. It also does OCR to pull out text that appears in the video. Pretty cool stuff, and it shows the direction things are moving: we can now use video to extract metadata for profiling.

A quick word about methodology. As we said, it's all about gaining trust: we extract information, influence the targets, repeat that for a couple of rounds, and eventually we have a compromise.
Talking about social engineering attacks: it's nice to blend attack vectors. You can work remotely, via email, messaging, SMS, or voice, or locally, where you show up in person, and you can of course mix and match. Attackers are lazy and like to work effectively, so they take the path of least resistance; basic stuff works, and you don't really need zero-days for these attacks. Here's a typical phishing email, from the DNC attack on John Podesta. Someone phished him pretty well. They used some basic obfuscation to bypass the filtering in the email solution and to hide the target URL behind the "change password" link. This shouldn't work, right? But it works every time, because people don't pay attention. You can see it's not a secure URL, and it's not the real domain, as you can see there, so you shouldn't click on it, but people usually do. They even used his own picture from his Google+ page to make it look nice, so he would actually enter his credentials and be phished. It's not very sophisticated, but it works, and that's the scary thing.
Talking about what's coming soon: there's something called deepfakes. I'm not sure you're all familiar with this, but you can now take a video and, with some pretty basic tools on your laptop, swap in a face from a set of videos or pictures, for instance putting a celebrity's face into a home video. This is moving really fast in terms of development. It doesn't have to be porn, but it's usually porn that drives the evolution of this sort of thing, which is kind of funny. We will see this used for social engineering attacks as well: you can alter a video, like the Obama video, and put someone else's face on it, and that's pretty scary stuff.

Which brings us to adversarial games. The concept has two counterparts: one produces fakes and submits them to someone who judges whether they're real. The example here is counterfeit dollar bills, a cat-and-mouse game: the counterfeits get better at tricking the investigator, and the investigator learns to spot the fakes. If we introduce machine learning into the mix, we get something called generative adversarial networks, or GANs. It's the same idea, with backpropagation added: a generator creates fake content based on training samples and submits it to a discriminator, which judges whether it's real by comparing against real samples. The judgment is fed back to both the discriminator and the generator, so both get better, until eventually the generator produces content the discriminator can't tell is fake. That's interesting from a social engineering perspective, because you could do this with audio: given a bunch of audio snippets, you can train a model that generates fake audio very close to the real thing. There's research being done in this area, and it's moving pretty fast, so it's a very interesting area for social engineers. So, yes, it's looking pretty bad: a lot of exposure, and a lot of attacks are working. What can we do to defend ourselves?
Thank you. So, I'll start talking about the tool. What it is, is that yellow box right there labeled "social engineering detection"; that's what we provide. It takes text that some potential attacker has spoken, analyzes it to see whether it's suspicious, and if it is, reports that back. You could connect it to an email client or a texting app pretty easily, because there the input is already in text form; or, with speech-to-text in front, you could use it on any audio or video source, even put it on a phone, and just feed it the transcribed text and have it output some kind of alert. Right now it prints text to the screen, "scam detected" or something like that, but you could easily attach any kind of warning. So we're analyzing each sentence as it's spoken or sent.

One thing that differentiates this from a lot of previous work on social engineering: most of that work is about phishing, just phishing emails. But what if the attack happens in person, or over the phone? Then you only have the content to look at; there's no metadata, no URLs, none of that. Our approach is effective as long as you have content; it doesn't matter what the vector was.

The basic idea for detecting social engineering attacks is this: the attacks are complicated, but somewhere in the attack, the attacker has to either ask an inappropriate question, meaning a question whose answer is private, like "what's your social security number", or give you an inappropriate command, telling you to do something you shouldn't, like "click on the link".
So we assume that somewhere in the social engineering attack they have to do one of those two things; I'll call it the punchline. They may warm you up first, flatter you, make you feel good, but at the end they have to say: okay, what's your social security number? They have to either ask a question or issue a command that's inappropriate, and that's what we're trying to detect.

Now for a demo; in case the live demo fails, I also have a short video of it that I'll talk through. It worked five minutes ago, anyway. Let me start it up. All I'm going to do is type in sentences, and it tells me whether each one is a scam or not. Okay, it's slowly starting. The tool is called "temp"; my students named it, they're lazy, but don't worry about that. You type in a sentence, so I'll start with a random one: "nice weather we're having".
I hit enter (the first one takes a long time; it's usually quicker) and it says "normal sentence"; I'll highlight that. So that worked. Now I'll type something more suspicious: a command, "give me money". That's suspicious, so we get "scam detected". Notice "give" and "money": we find the verb and the direct object, and that's the idea with commands; that verb/direct-object pair is suspicious. We're not just searching for the words "give" and "money", though; we look at how they're used in the sentence. Here's "give me a poodle for money": a silly sentence, but I bring it up because it's innocent and it contains both "give" and "money". This one we will not flag as malicious, because we can see the two words aren't paired: "poodle" is the direct object, so it's an innocent sentence. Just because a sentence contains the words "give" and "money" doesn't mean it's malicious; we look at how they're used.

Then I have some questions (those two were commands). "What is your password?" I spelled it wrong, but that's clearly bad, so I hit enter; it detects it as a question, and a second later it says "scam detected". If you ask a question whose answer is not private, it won't flag it: "what is your age?" We didn't consider that suspicious, so it's not in our database of answers that count as private, and it comes back as a normal sentence. That's basically what it does; if I have time at the end, I'll show a longer version of the demo.
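The verb/direct-object check reduces to a pair lookup once the parse has extracted the pair. Here's a toy sketch that assumes extraction has already happened; the pair list is invented for illustration, not the tool's real database:

```python
# Toy sketch of the (verb, direct-object) command check described above.
# The real system extracts these pairs from a dependency parse; here we
# assume the pair is already extracted, and the blacklist is illustrative.
SUSPICIOUS_PAIRS = {
    ("give", "money"),
    ("send", "password"),
    ("click", "link"),
}

def command_is_suspicious(verb: str, direct_object: str) -> bool:
    return (verb.lower(), direct_object.lower()) in SUSPICIOUS_PAIRS

print(command_is_suspicious("give", "money"))   # True:  "give me money"
print(command_is_suspicious("give", "poodle"))  # False: "give me a poodle for money"
```

This is why "give me a poodle for money" passes: the pair actually extracted is ("give", "poodle"), which is not on the list, even though "money" appears elsewhere in the sentence.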
Now, the system structure. This diagram is big and I won't talk through the whole thing, but the idea is that it starts with text spoken by some attacker, and at the end, on the right, it gives you malicious sentences, meaning sentences that look suspicious. If you're scanning an email and one or more sentences look suspicious, we say it's a phishing email; if you're talking to someone and one of their sentences is malicious, we warn you that they're scamming you. The main part I'll talk about is here.

First there's sentence processing: people don't necessarily use proper punctuation, so we sometimes have to insert periods automatically. Given the sentences, we determine whether each is a question or a command. Maybe it's neither, in which case we ignore it; if it's a command, we do command analysis to see whether it's malicious, and if it's a question, we do question analysis to see whether it's a malicious, i.e. private, question. So those are the things I'll cover: how we tell whether a sentence is a question or a command, and then how we analyze each.

So how do you tell whether it's a question or a command? This, like much of what we do, is based on parsing. We take the sentence and run a syntactic parser, which tells you the grammatical structure of the sentence: the part of speech of each word, noun phrases, verb phrases, all the stuff you learned in sixth grade and forgot (I had to learn it again). Parsers do this automatically; we use the Stanford Parser, a popular one that works really well. It gives you a parse tree of the sentence showing the different parts of speech, and we look at the tree's structure for patterns we think are suspicious.

Here's an example, the sentence "can I eat", with its parse tree over there. To explain the tags: S is for sentence, NP for noun phrase, VP for verb phrase, MD for modal, and so on; you get a tree showing the internal structure of the sentence. Now, this is a question. Compare "can I eat" with "I can eat": one way to form a question in English is to swap the subject and the modal verb, "can I" instead of "I can" (it's the same in French and a bunch of other languages), and that's one way to detect a question. The parser actually does this for us with the SINV tag, which stands for subject inversion: if you see it in the parse, you know this is a yes/no question where the subject and the modal were inverted. So we get that for free; we take the parse tree from the other tool and just check whether the SINV tag is in there. We also look for SQ tags, because there are other types of questions: that was a closed, yes/no question, but there are also open questions, "what is this", "what is that", and we can detect those too. There's a series of slides I left out because of the detail (I have a white paper I'm happy to give anybody who wants it), but the short version is that there's a set of tags we look for, and if we see them in the parse, we know it's a question.
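Once you have the parser's bracketed output, the tag check itself is trivial. A sketch, assuming Penn Treebank-style labels of the kind the Stanford Parser emits (SQ/SINV for inverted yes/no clauses, SBARQ for wh-questions); the exact tag set to match is an assumption here:

```python
import re

# Question-detection sketch over a bracketed parse string.
QUESTION_TAGS = {"SINV", "SQ", "SBARQ"}

def is_question(parse: str) -> bool:
    """Collect every constituent label '(TAG ...' and test for question tags."""
    tags = set(re.findall(r"\((\w+)", parse))
    return bool(tags & QUESTION_TAGS)

print(is_question("(ROOT (SQ (MD Can) (NP (PRP I)) (VP (VB eat))))"))      # True
print(is_question("(ROOT (S (NP (PRP I)) (VP (MD can) (VP (VB eat)))))"))  # False
```

The parser has already done the hard linguistic work; the detector only has to recognize which clause-level labels it produced.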
If it's not a question, we check whether it's a command. There are several ways to phrase a command. The common way is a verb with no subject in front of it: "go home", not "he goes home"; "stop right there". The verb comes first and there's no subject in the sentence at all, so if you see a sentence like that, a verb with no noun phrase to its left, you say it's a command. That's one common type, but there are others, like the more polite forms. Often you'll say "please go home"; social engineers especially do that, because they want to soften it so it doesn't seem like an order. Another thing they do is suggest: "you could go home", "you should go home", "it would be good if you went home". We detect those soft suggestions too. "Could" and "should" are modal verbs, and we look for a pattern in the tree where you see "you", then a modal verb, then the verb and the rest of the sentence, and we call that a command as well. So there are several types of commands we detect by looking for patterns in the parse tree.

Once we know whether it's a question or a command, we analyze it to see whether it's malicious. For question analysis, the basic idea is to determine whether the answer to the question is private. If the answer is private, sound the alarm; if not, no problem.
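The real system matches these command shapes in the parse tree; as a rough word-level stand-in, the three shapes just described (bare leading verb, "please" + verb, "you could/should" + verb) look something like this. The verb list is a tiny invented sample, not the system's:

```python
# Word-level sketch of the three command patterns described above.
VERBS = {"go", "give", "send", "click", "open", "stop"}
MODALS = {"could", "should", "would"}

def looks_like_command(sentence: str) -> bool:
    words = sentence.lower().rstrip(".!?").split()
    if not words:
        return False
    if words[0] in VERBS:                                   # "go home"
        return True
    if words[0] == "please" and len(words) > 1 and words[1] in VERBS:
        return True                                         # "please go home"
    if (words[0] == "you" and len(words) > 2
            and words[1] in MODALS and words[2] in VERBS):
        return True                                         # "you should go home"
    return False

print(looks_like_command("Please click the link"))  # True
print(looks_like_command("He goes home"))           # False
```

The parse-tree version is more robust than this word matching, of course, since it knows which word really is the main verb and whether a noun phrase precedes it.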
So if you say "where is the bathroom?", the answer is not private, so no problem. But if you say "what's your social security number?", the answer is private, so we sound the alarm. So we want to tell whether the answer to the question is private. (I'm not going too fast, am I? No? Okay, cool.) How do we tell? We basically use research on question answering systems. Question answering is an active area of research; a lot of people work on it, and there's a lot of industrial work on it too. The goal is: you ask the system a question and it gives you an answer. It either looks through a massive database of facts to find the answer, or it looks through the internet itself. Just like what Google does, except better: it understands the question well enough to actually find the answer. I don't develop question answering tools myself, but they exist. And they're pretty bad, because it's a hard problem: taking a natural language question and producing an answer is really hard. So when I say they're bad, I mean they're not correct most of the time. But that's okay; that's where the field is. For instance, the tool we're using is correct 44% of the time on its own test questions, which is low, but we modify it so that it's good for us; I'll explain why in a second. Anyway, we're using the type that looks through a massive database. The idea is: you take a question in English, you generate a query, say against a SQL database, then you use the query to search the database, and you find the answer.
That's how a lot of question answering tools work. What we do is basically make two changes to an existing question answering tool. One: instead of having this database of 15 million facts, we only put in the private facts. Only the private facts. Then if a query finds a match in there, you know the topic is private and this is a malicious question; if it doesn't, the question is innocent. See, we don't need the answer to the question; all we need to know is whether it's private or not. So instead of 15 million facts we'll have, say, a hundred facts, whatever private facts you want, and we put them in the database. Actually, I should mention: we put in information about the fact, but not the fact itself. Say I decided President Obama's age was private: I'd put that entry in with its private tag, but I wouldn't put the age itself in there, because that would be stupid; you'd have a database full of private facts for somebody to steal. We don't need the actual fact; we just need to be able to say that if you did a query search, it would find this entry, and that's enough. If we find it in there, it means it's private. The question answering system we use is called Paralex (here's the reference to their paper), and it's open source; we just downloaded it and used it. Here are three entries out of their table. They have a massive SQL table of over 15 million entries that they culled from Wikipedia automatically, and everything is a triple: a relation and two arguments. The official language of Hong Kong is Cantonese; the singular of bacteria is bacterium; and so on.
So they have this massive database of facts, and what they do is take an English question, generate a query, and search the database to find the answer; they were correct 44% of the time. (You had a question? I was going too fast? Oh, you just wanted to see it, okay. If you want any more information, I'm happy to give you the whole paper after this, if you have a thumb drive.) Anyway, that's the idea; and we didn't build this part, they do it: they make the query. Now, one thing is, since question answering is really hard, they don't generally make just one query. They take the English and make multiple queries, like 30 or 40 of them, because the system is basically guessing at what the right query is; it doesn't know, so it tries lots of different possibilities. So it generates 30 or 40 queries for one question, then it ranks them with a ranking algorithm, and the one at the top of the ranking is the one they use; they search with it, and if they find the answer, that's what they return. This is an actual example using their tool, where it comes up with these two queries; it comes up with a lot more, but I'm only showing the top two. The top query answers "Steve Jobs," because it misunderstands the question; the second one actually gives the right answer, the date. So for this question they answer incorrectly: they come up with all these queries, they choose the top one, their ranking was messed up, and they got the wrong one.
But usually the correct query is somewhere in the top group of queries, and they only choose the single top one. So we changed that. One thing we do, like I said, is strip the database down to just the private facts, or rather the information about the private facts. The other thing is that we use the top 15 queries. We don't just take the top query and look for the answer; we take the top fifteen queries, search them all against the database, and if a match is found, we say the question is private. What that does is increase the rate of true positives, because if the right query is ranked second or third, we'll still find it; we end up correct 99% of the time. It may also increase the rate of false positives, because maybe the question isn't actually about anything private, but one of the other fourteen queries happens to match a private entry. That's possible but extremely rare. Remember the proportions: this tool uses something like 15 million facts, and we have maybe a hundred private ones. The odds that fifteen essentially random queries accidentally hit the private set are really slim; you're talking about a hundred out of 15 million. So the chance that this increases false positives is really slim; it hardly ever happens. Even though we check all fifteen, the other fourteen are probably not going to match anything private, because they're just randomly spread across the space of facts. So that's what we do with question answering; that's basically how we analyze a question.
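The top-15 modification on the question side can be sketched like this; the query tuples and the ranked list are invented stand-ins for Paralex's real candidate queries and ranking.

```python
# Sketch of the second modification: instead of searching only the single
# top-ranked query, check the top 15 candidate queries against the private
# set, and flag the question if ANY of them hits. Names and query format are
# illustrative, not the actual Paralex interface.

PRIVATE_FACTS = {("password-of", "email account"), ("ssn-of", "employee")}

def question_is_malicious(ranked_queries, k=15):
    """ranked_queries: candidate (relation, subject) queries, best-ranked first."""
    return any(q in PRIVATE_FACTS for q in ranked_queries[:k])

# The right query is ranked second here; checking only the top one would miss it.
candidates = [("password-of", "wifi router"), ("password-of", "email account")]
print(question_is_malicious(candidates))  # True
```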
In addition to analyzing questions, we have to analyze the commands, which by the way are much more plentiful: in the data we were looking at, you see a lot more malicious commands, like "click this link," than malicious questions. Now, that doesn't mean too much; if you did this on in-person attacks instead of phishing emails, I would expect something different. Anyway, command analysis. The idea here is: if you get a command telling you to do something innocent, fine, no problem. But if they're telling you to do something like "please tell me your social security number," you shouldn't do that, so it should set off an alarm. So how do we distinguish which commands are suspicious and which are not? We basically summarize the meaning of the command as the verb and its direct object in the sentence, and then we keep a list of those verb-direct-object pairs. For instance, "take a left at the corner": "take" is the verb, "left" is the direct object. "Please give me a password": "give" is the verb, "password" is the direct object. So we take the command, find the verb-direct-object pair, and then we have a blacklist of bad verb-direct-object pairs. If somebody says give-password, that is bad, and we flag it. That's basically what we do: we look the pair up in a blacklist, and if we see it in there, we say it's bad. Now, there's other stuff I'm not going to talk about. For instance, synonyms: we handle synonyms (my little demo version doesn't, but the real version does). It does lemmatization, so for instance the verb might be in a different tense, like "gave"; it normalizes every word to one lemma, and plural versus singular for nouns, it normalizes all that. When I say we did that, again, this is all built into the Stanford parser; they have tools for it already. So we have this topic blacklist; when I say topic, I mean a verb-direct-object pair.
You have to make that list, and then you just look up whether the verb-direct-object pair is in it. So how do we find the verb and direct object in a sentence? We parse the sentence, again using the Stanford parser; they have a typed dependency parser, and it's very cool: it tells you the semantic roles of the words in the sentence. Of specific interest to us is "dobj". If I say "please give me your password," the parse will include dobj(give, password), which tells me "give" is the verb and "password" is the direct object. The Stanford typed dependency parser already provides that; I didn't even have to write it. There's another way a direct object can show up, too: there are two tags, "dobj" and one for passive sentences ("nsubjpass"), but either way it gives you the verb and direct object. So that's how we find the verb and direct object of the input, and then we just look the pair up in the blacklist. Now, this blacklist of verb-direct-object pairs: you could compile it manually.
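Putting the two pieces together, the verb-direct-object extraction plus blacklist lookup might look like this sketch. The dependency triples are hand-written stand-ins for the Stanford parser's output, and the lemmatization that the real system performs is assumed to have happened already.

```python
# Sketch of extracting the (verb, direct object) pair from typed-dependency
# output and looking it up in a blacklist. In the real system the triples come
# from the Stanford dependency parser, which emits relations like
# dobj(give, password); here the parse is hand-written for illustration.

BLACKLIST = {("give", "password"), ("send", "money"), ("give", "money")}

def extract_verb_dobj(dependencies):
    """dependencies: list of (relation, head, dependent) triples."""
    for rel, head, dep in dependencies:
        # "dobj" covers ordinary direct objects; "nsubjpass" covers the
        # corresponding role in passive sentences, as mentioned in the talk.
        if rel in ("dobj", "nsubjpass"):
            return (head, dep)
    return None

def is_malicious_command(dependencies):
    pair = extract_verb_dobj(dependencies)
    return pair is not None and pair in BLACKLIST

# "Please give me your password" -> dobj(give, password)
parse = [("nsubj", "give", "you"), ("dobj", "give", "password")]
print(is_malicious_command(parse))  # True
```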
In fact, compiling it manually is a good idea, and we're going to explore that when I get back next week. But here's what we did: we have a pile of about 187,000 phishing emails that we got from various sources. We took a hundred thousand of these phishing emails and looked at the verb-direct-object pairs in them. What we wanted was verb-direct-object pairs that occur in phishing emails and don't occur in non-phishing emails. To do that, we took a hundred thousand phishing emails and a hundred thousand Enron emails, which were our non-phishing emails, and computed term frequency-inverse document frequency (tf-idf; it was mentioned in the last talk too). It's a metric that gives a high value when something appears a lot in one set and not in the other; so if a pair shows up a lot in the phishing and not in the non-phishing, it gets ranked highly. We did that, picked a relatively arbitrary cutoff point, and everything above it became our blacklist. There are a lot of cases where that's not the best thing to do; you might want to build the list manually instead, and you can do that too.
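The automatic blacklist construction can be sketched like this. The scoring formula below is a simplified tf-idf-flavored ratio, not necessarily the exact formula the talk used, and the pair counts are invented for illustration.

```python
# Sketch of building the blacklist automatically: score each (verb, dobj) pair
# so that pairs frequent in phishing emails but rare in the non-phishing
# (Enron) emails score high, then keep everything above a cutoff.

from math import log

def score(pair, phishing_counts, ham_counts):
    tf = phishing_counts.get(pair, 0)                 # frequency in phishing
    idf = log(1 + 1 / (1 + ham_counts.get(pair, 0)))  # penalize ham frequency
    return tf * idf

# Invented counts: give-password is common in phishing, rare in Enron mail;
# take-left (innocent driving directions) is the other way around.
phishing = {("give", "password"): 500, ("take", "left"): 3}
ham      = {("give", "password"): 1,   ("take", "left"): 40}

CUTOFF = 10.0
blacklist = {p for p in phishing if score(p, phishing, ham) > CUTOFF}
print(("give", "password") in blacklist)  # True
print(("take", "left") in blacklist)      # False
```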
So anyway, we took the top of that ranking as our blacklist. Okay, demo time; really cool, now let's see if it actually works. (You saw the video, in case it fails.) It's a little hard to read, but this is a virtual machine, an Ubuntu virtual machine instance, and there are four windows here. This one is just my directories, which I might need later. This is where the action is going to happen: this is where I type in the sentence, and it tells you down here whether it's a scam or not. These two windows are related to the question answering: this is the question answering tool, which has its own little server running in that window, and this one is the parser, the Stanford CoreNLP parser. This one talks to that one: when the question answering tool needs to answer a question, it sends the sentence to the parser, which parses it and sends a parse tree back. But most of the action happens over here.
So let's try this. Let me ask something innocent first: "hello there." That's not anything, and it says "normal sentence," fine. And the verb-direct-object pair: it didn't find anything, because it's not even a command; the commands are where we need that. So let's give it a malicious command. What's a good one? Now I'm blanking on the example I was going to use. "Give me money"? I don't know if that was in there; basically I have a limited blacklist, I'll show it to you in a second, and I have to use something that's in it. For this demo I have like eight pairs or something, though you could easily add more. So let me stick with my plan: "give me money." It's slow, I don't know why, but... there: scam detected. Good. Now if I say "give me a car," it shouldn't flag it; it's just a normal sentence. And notice it says so: give-car is not in my blacklist. But I can make it bad, hold on. What I'm going to do is modify my blacklist, which I have open right here; it's this JSON file.
Let me open it up. Of course you'd customize your blacklist depending on your environment; you can put whatever you want there. Yeah, that's exactly right: the idea is that if you were using this at your company, you would put in verb-direct-object pairs relevant to what you think people would try to get out of your employees. You would make it by hand. But here's the sample one I've got, and you can see: give-money, give-resume (oh, I actually had send-money in there, what do you know), borrow-money, whatever. So I'm going to add give-car. All those "o" fields: ignore those, they're options we haven't really implemented yet; there's a bunch of things we're going to put in there soon, like a tf-idf score and some other numbers, but they're not used right now. Okay, I think I put it in there, so let me save it and close it up.
Now I'll quit the program and start it again so it re-reads the blacklist. Now I should be able to say "give me a car," and, slowly, it will detect that as a scam. There it goes: scam detected. So it's as easy as that: you can just add verb-direct-object pairs and cater it to your own needs. Question: does "car" still have to be the direct object of "give"? Yes, most of the time. Can you give me a sentence? I'm willing to try it; it may fail. So just so you know, this demo is fragile. If you said "cars" with an s, the demo would miss it just because of that; the real version, which is on the GitLab, handles it, but the demo we scraped together quickly doesn't. Oh, I'm sorry, I think he was next, and then you; I don't want to screw you over. Okay: I used to work for a German company, and the Germans really hated speaking English with us because of the semantics of how we talk. I've written a few sentences down here,
like "would you by chance know the...?" and "are you able to tell me the...?" And I don't want to screw you over, because this is really awesome and you're trying to squeeze it into what seems like a really short amount of time, and these are quite complicated sentences, so this is kind of guaranteed to fail on them. Yeah, sorry. What was the first one, just out of curiosity? "Would you by chance know the password associated with the account?" Thanks. This demo will not catch that, but the actual version would: it would detect "would" as a modal and catch it. But most of the ones you're going to come up with, it won't. Okay, that's all right; you're next, and then you. Go ahead; straight behind you, way back there, at the wall. That's exactly how I analyze, which is how I would like this next question to be treated, if we can. Here is an example sentence from a real-world case: "I am in need of your company's API keys, can you please provide?" It would catch "need" and then "keys." Oh, you said "could you please provide it"? Okay. So, one thing it doesn't handle right now is pronoun references.
"Could you provide it": that could be caught if we handled pronouns. Pronoun resolution is a known problem in natural language processing. I'm not doing it, but if I were, I wouldn't make it up; I'd basically find an off-the-shelf technique and apply it, and then it could handle that. Yeah, and actually in that context the word "it" wasn't even there; it was "can you please provide," and no, then it wouldn't catch it. So when you start talking like that: basically, this is limited to one sentence at a time. To understand that exchange, you have to connect two sentences, and we don't do that. Could it simply handle the phrase "I am in need of API keys"? It would get need-keys, and if that was in your blacklist, it would catch it. Got it. Next question. First, I just want to say this is really cool, and I feel like the audience is collectively trying to figure out ways to break this. I can think of lots of really cool ways to use this, not just for detecting phishing, but also for catching
bad user behavior. I'm thinking you take your internal chat feeds, through Slack or whatever, and you want to stop people from sharing passwords or API keys with each other in a chat forum; this would be a cool way to spot employees behaving badly. Yeah; there are all sorts of evil uses too, but I won't go into that. It's like, why isn't the US government already doing this? Maybe they are. So my question: how are we handling, not in the demo version but in the real version, real targeted social engineering, in which the attacker creates a scenario that seems realistic, like "hey, I'm the sysadmin, blah blah, please update your password"? I'd catch that for sure: update-password. I mean the targeted aspect of it. Well, one of the beauties of this is that all that complicated stuff you do, the pretexting, the elicitation, all of that, doesn't matter; we're not even looking at it. We're just looking at the punchline at the end, where you say "give me the data" or "do this." So does this flag things like actual security communications, where they say "hey, you guys need to update your passwords"? Yes. And before I get to your question, let me say: basically, the problem is that this has no idea who is speaking, so it assumes a stranger is speaking. If your mother asks you for a hundred dollars, you should give her the money (maybe), but if a random person asks, you should be suspicious. We would have to be able to identify the speaker, and that's part of what you're saying; we can't do that. There are other techniques for that sort of thing that we will investigate; you'd have to add them on top. Go ahead, you had a question right there, in the blue.
It's kind of similar to the last two questions, kind of a combination. I know you just said that you only look at one sentence. Going back to real-life social engineering: "can you give me your password," but say I bury the subject or the object across multiple exchanges. Say I start with "hey, what do you think makes a secure password?", and that's just a question about symbols and such; then "well, how do you set yours? What is it?" Okay, so "what is it" right there is the punchline, and if I could resolve that "it" back to the noun, which is easy, then I could detect it: I'd substitute "your password" for the "it" and then detect "what is your password." Right, and that's one example. But say you keep this up over a natural conversation with memory; it would be fairly difficult to trace back what the "it" is, especially if you're talking about multiple things. Okay, so I disagree with that. This problem of tracing back what the "it" refers to, they call it pronoun resolution, and natural language processing people have been trying to solve it for decades; they do a pretty good job of it. I didn't implement one of their techniques, but I don't think it's as hard as you're saying; people have looked at that problem a lot. Is that something you guys are looking at? Yeah, I'm going to add it; I just got a nice fat grant to do this, and I'll be doing all of that. Okay, thank you. Yes.
Which is to say, if I understand what you've built: it's a simple solution, and because it's simple, it's not baffled by all the pretexting people are talking about. And complex, awkward sentences have the problem that a real human being misunderstands them too. So if a social engineer really wants to know for sure that the password they've just been given is in fact the password, as opposed to something else, they ultimately have to be clear to the human being, and the point at which they're being clear is your punchline, and that's what you're triggering on. Mm-hmm. Let me comment on that. So this guy Chris Hadnagy, you may know him, he runs the social engineering village at DEF CON; I've published with him earlier and we talked about this a lot. He wants to use this to train social engineers to be better: you try things out, and if I detect it, you're rewarded for rephrasing it in a new way so I miss it. And there can be this back-and-forth, just like with malware detection: I change my detector to be better, and they change their attacks to be better. But mine has a finite limit, in the sense that you can't just write random English sentences; it's got to be something understandable by a human. You can't write arbitrarily complex, bizarro-sounding sentences. So there's a limit to how long that arms race can go before I win, before I can detect every way a human could express something; that's my claim, anyway. Whereas with malware detection, I don't see that limit. Question? Oh yeah, sorry, he's not after you, I'm sorry. You go ahead. Okay. So, speaking as a social engineer: I work in Brazil, and security awareness there is pretty much nonexistent.
So actually this would work well there, because most users are way more careless than we can possibly imagine: they hand over passwords without you even asking. Doing a social engineering job there is the easiest thing ever; the security guard gives you the password to the door, it's ridiculous. And it's mostly not because people are stupid, but because they've never had contact with security awareness programs and such. So this could be very helpful, because it generates an alarm; it gives you the alert that this phrase has "password" and "give" together, and you should really think about that. So it helps bring awareness to everybody who works with confidential info like passwords. I think this problem is not just Brazil; I'm from Scandinavia, and it's similar. Like you say, awareness is low everywhere, so whatever we can do helps the situation. But he has to go next, because we skipped him. Go ahead. "I would like to give you a car." "I would like to..." I don't know; let me just try it real quick.
"Normal sentence": it didn't detect that. Yeah. Wait, he's next though, and then I've got to go. Go ahead; I'm back. All right, so one other tactic that I see as part of social engineering campaigns I've run is that I will genericize a question as much as I possibly can. What I often see happen is: I'll feed some information that makes it appear as if I have insight into some secret, and then from there I'll ask a generic follow-up, like "what other information do I need to know?", or more blatantly, "what else might I need to do to do X?" I might not even say "to do" anything; I might just say "what else might I need?" And once I've confirmed something that appears semi-secret but isn't really, that's the point at which the mark may start to vomit information right back at me, because they feel there's some established bond of trust. I feel like this isn't necessarily designed to speak to that, but I don't think that's necessarily the point. If I'm understanding correctly, this is intended to sift out the broad spam phishing attempts, as well as some of the modestly more intelligent ones; somebody who's exceedingly motivated, or on a particularly motivated whaling mission, might still be able to get past it.
Thoughts? Yeah. I mean, there are a lot of ways you can bypass this; remember, it's a fairly simple tool. Like you said, you can even say something incorrect on purpose, and they'll say the correct thing back, that sort of thing. This wasn't designed to capture those very sophisticated examples. Actually, I have a slide along the lines of what you're saying: we detect questions and commands, so just don't use one, suggest instead. You could say "what is your password?", or you could say "I can reset your account, but I'll need the password first." You didn't ask a question, but they're going to give you the password. That's along the lines of what you're saying, and it's more sophisticated than I can deal with, because you'd have to understand the mental state of the person, and I can't do that yet. Him, behind you, right there. So I also run an offensive security team of sorts, I live in this world as well, and I just wanted to make a comment and respond to what you're hitting on. I think the point here isn't that we've
designed the perfect system to solve social engineering. It's that we now have, perhaps, a tool, where without this, what is our defense? It's relying on the human to catch even the most basic, dumb stuff: someone going "give me your password" and the user going "okay." That is not a solved problem right now, and we try to use awareness and things like that to push it up. But if we could technically solve it, which this does, we could at least raise the bar, so that the dummies don't get our users' passwords and attackers at least have to really try. Yeah, which is the idea of security: make them try harder. And I think this is a really cool thing. So, was there another question? Thank you. Oh yeah, we've got another five minutes left; that's fine, I don't have to finish the rest of the talk; this is why I wanted to get to the demo, the rest is just fringe stuff. Go ahead. I'll be quick: you had the slide that said you input something, then you test it, then you get the result, and then it feeds back; so could we put this in and have it say "hey, it's not working"? Yeah, I had that slide; it doesn't quite do that, meaning
all we do is exactly what this shows: it takes text and it prints out text. But you could easily hook it up so that instead of printing text, it puts a warning on the screen in front of you, or a message in your ear on a cell phone, something like that. I haven't closed that loop, but I think that's an easy thing to do. Okay, I'm going to go on. So basically, the results: we tried it on a lot of phishing emails we found from these different sites, and here are the results. They're not that good, actually, but it's a starting place. This column is for phishing emails and this column is for Enron emails, and the rows are how many we classified as malicious (detected) or not detected. We had about 187,000 emails of each type; we used 100,000 to train, so 87,000 of each are left for testing. We would like all the phishing emails to be detected, so ideally this cell would be 87,000, and none of the Enron emails flagged, so this cell would be 87,000 too. These other cells are errors, and
they're not zero. So let me try to explain some of them, starting with the false negatives on the phishing side: 35 percent of the phishing emails were not detected as phishing. Why did it fail so badly there? I had students look at a hundred of these false negatives and manually figure out why they weren't detected. What happened is this: we detect the punchline, the question or command, but a lot of the phishing emails in the datasets we looked at don't have one. They're just the start of a conversation; the most suspicious thing they say is something like "hit me back." So we don't catch them, and that accounts for about 79 percent of the ones we didn't catch. In fact, here's one (this is a subset of it); at the end it says "please, you can call us or reply to us," and we didn't catch that as malicious. Now, I would argue that eventually, in that email conversation, they would have asked a question or made a command, and we would have caught it then. So I don't feel too badly about missing these false negatives, because I think it was just a bad dataset, in my opinion. Oh, and let me mention this aside before I run out of time. One problem we've had with this work is that people say: look, you claim you can detect in-person and over-the-phone attacks, but you haven't evaluated on any. That's because those don't exist publicly: you can find phishing emails, but you can't find recorded conversations that are social engineering attacks. So what we're doing with my
wonderful grant that I just got is we're going to run attacks ourselves, you know, a social engineering village. I got it approved: I'm going to pay students fifteen bucks and say, "look, sometime in the next three months you're going to get an attack; we're going to call you up and try to scam you." We're going to tell them that explicitly, the date range and everything, but they're going to forget. Two months later we'll call them, they'll have no recollection of what I told them, and then they'll get fooled. We're going to run a bunch of attacks so we can get a dataset and say, look,
these attacks work and these don't. We're going to do that next quarter.

[Moderator] I'm sorry, we're running out of time, but I think you can squeeze in one question.

[Audience] Do you have any concern that students these days don't actually get calls anymore? The mere fact that you're calling them may already be suspicious.

[Speaker] I mean, I think it's going to work. Some people get calls; I don't think it's that bad with kids today. Look, we're going to
try it. What I got approved is that we'll spoof caller IDs and make calls like that.

[Moderator] Since this is the last talk, let's take another five minutes for questions.

[Audience] I'm curious if you've tried combining this with any kind of sentiment analysis and seeing how that might play in.

[Speaker] I have not. I looked at sentiment analysis, but for what I'm trying to do I don't think I need it, because sentiment is basically plus/minus, as far as I understand: good or bad. But
some of these things, like "give me money": is that good? Is it bad? You can't even classify it that way, so I don't see how it's useful for this.

[Audience] This is actually not a question, it's more of a story you may want to consider. In Brazil, calls are very common, so common that people don't think twice about believing them. Things like, "hey, you just won a car, give me your credit card number so I can credit you," and people actually give the credit card number.

[Speaker] Are you saying Brazil should be a target of ours?
[Audience] And we need it very badly, because months ago my coworker came to me crying, money in her hand, very desperate: "Oh Marina, the doctor just called me, my husband is in the hospital and I need to pay for the transfer. Can you take me there? I have to take the money to the doctor." I was like, wait, take your money to the doctor? What? Call the doctor back. So she called the number that had called her, and a guy answered, and I asked who was talking. "Oh, you know, I was working with a guy and he got in an accident, and I'm in
the hospital with him. The doctor is very desperate, they need to transfer him." And I said, okay, let me talk to the doctor, please, and this other guy answered, and I asked, how is the husband doing, what happened? And the guy answered, "you know, his leg is kind of hanging, it's very bad, just bring the money, it's very bad." So she was like, okay, I see what's happening here, but that was a good try, man. And that happens, and it works very well.

[Moderator] Okay, so I guess we're
done. All right, thanks. Thank you so much.

[Applause]