
Gollum: One Anti-Phish Bot to Rule Them All

BSides Cape Town · 2019 · 45:10 · 1.7K views · Published 2019-12 · Watch on YouTube ↗
About this talk
Gollum is a Python-based phishing detection and response system deployed in production at a major South African bank. It automatically detects credential phishing targeting the brand by analyzing client emails, website referrer logs, and anti-phishing feeds using image recognition and HTTP analysis. The tool includes a Telegram bot for reporting, a web dashboard for statistics, and is deployed on AWS using Terraform and Ansible.
Original YouTube description
Title: Gollum – One anti-phish bot to rule them all

Abstract: Credential phishing scams cost companies and the general public millions of Rand every year, and also degrade the reputation of targeted companies. Gollum's purpose is to detect, report and track phishing sites. It does this using website referrer logs, anti-phishing feeds and client reports to automatically find phishing targeting a brand. These sites are then reported to anti-phishing feeds, which in turn populate commonly used browser blacklists. Gollum parses links and attachments from client emails, which are used to classify the email as phishing. This is achieved using image recognition applied to screenshots taken in a headless browser. Gollum also makes use of HTTP GET requests and the HTML of the website to decide if it is phishing. Phishing sites which fetch content from the target will appear in the target's referrer log; this is an additional resource which Gollum utilises to detect phishing. Gollum includes a Telegram bot to facilitate the reporting and verification of phishing sites. Additionally it includes a web front end for viewing statistics and loading phishing pages in a "sandbox" environment. It is written in Python, is hosted predominantly on cloud infrastructure and is deployed using Terraform and Ansible. This tool has been in production at one of the large banks in South Africa for several months now, and interesting statistics have been gathered which will be shared in the talk. Gollum is also set to be released as an open source project.

Speaker: Byron Rudman
Twitter: @ByronRudman

Speaker Bio: I studied information engineering at Wits and started working at Absa bank in 2015 in the cyber security team. For the last year I have been the lead developer on a SOAR tool, automating cyber security processes. My team and I use a DevOps life cycle when implementing solutions, which I plan on including in the talk.
I am in the process of obtaining the OSCP certification and will (fingers crossed) be getting it mid next month. I have a passion for automating cyber security processes and believe that it will become mandatory in every cyber security team as adversaries become more autonomous.
Transcript [en]

so it's Gollum [Laughter], the Lord of the Rings character. Okay, so thank you for coming to the talk. I know next door's got a cool talk, so I appreciate you guys coming to this one. So: Gollum, one anti-phish bot to rule them all. This is a phishing bot that deals with phishing reported to our bank. So I'm going to get started with who we are. That's me, this is Benny, he's over here, and there's our Twitter handles. I've got about five followers, so I'll give my sixth follower a free beer, my seventh will get two free beers, and so on until I run out of money, and that'll be sooner than you guys think

maybe like three beers, yeah. So we both work at Absa in the cyber security team. We work on automation, on a group of tools known as a SOAR tool (we've only got one), and that stands for security orchestration, automation and response, and we built this bot together. Okay, so what phishing are we talking about here, and why was this developed? The phishing that we're talking about here is credential phishing: the company being targeted will have their website scraped in some way, and a clone of that website will be created. Then a phishing email will be sent to a client of this company, they'll open it, it'll open in

a web page, they'll fill in their credentials, and then their data is stolen. So that's the phishing that we're talking about. And why was this developed? Absa is a huge target of phishing, and when you report phishing to Absa it goes to a mailbox; you report it via email, you just forward your phishing mail to Absa. Now, this mailbox was manually monitored, right, so it would take several days for you to get a response if you reported an email to this mailbox. And also, you know, we want to take these things down as quickly as possible, so if we're only responding to you a few days after you sent us a phishing mail, that means that

websites are only being taken down a few days after you knew about it, and after we should have known about it and actioned it. So we've automated that mailbox, and also automated other ways to detect phishing, which I'm going to get into. We've got a methodology that we came up with to deal with this phishing problem end to end: detection, classification, confirmation, reporting and tracking. Under detection we use client emails that go to this mailbox that I was talking about; we pull those emails and try to classify them. Next we use referrer logs, and I'm going to go into what a referrer log is and how you can use it to detect phishing
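As a rough sketch of that mailbox automation, polling an abuse inbox over IMAP and pulling candidate links out of client reports might look like this in Python. The host, credentials and sample email body below are placeholder assumptions for illustration, not Gollum's actual code:

```python
import email
import imaplib
import re
from email.message import Message

# Any URL-ish token in a message body; good enough for a sketch.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def extract_urls(text: str) -> list[str]:
    """Pull candidate phishing links out of an email body."""
    return URL_RE.findall(text)

def fetch_reported_mail(host: str, user: str, password: str) -> list[Message]:
    """Poll the abuse mailbox for unseen client reports (e.g. from a one-minute cron)."""
    imap = imaplib.IMAP4_SSL(host)
    imap.login(user, password)
    imap.select("INBOX")
    _, data = imap.search(None, "UNSEEN")
    messages = []
    for num in data[0].split():
        _, parts = imap.fetch(num, "(RFC822)")
        messages.append(email.message_from_bytes(parts[0][1]))
    imap.logout()
    return messages

body = "Dear client, please verify your account at http://evil.example/absa/login.php today"
print(extract_urls(body))  # ['http://evil.example/absa/login.php']
```

The real bot runs something like this on a schedule (the talk mentions a one-minute fetch) and pushes the parsed links and HTML attachments onto a queue for classification.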

and then we use anti-phishing feeds; there are a lot of anti-phishing feeds on the internet. Then we classify: we use the HTTP GET requests, the HTML content on the page, and then image recognition too, so we screenshot the page and classify it that way as well. And then we need to confirm, because all of these can be error-prone. To confirm we use a Telegram bot, and you can also confirm in the application that we developed, which I'll show you a bit later. Then reporting: as soon as we know about something we want to report it immediately, because we want to get it blocked in browsers and we want the website to be taken down. So we

report to anti-phishing feeds. The main one, the most useful one that we report to, is PhishTank, and then we report to our anti-phishing vendor, and that's because we can't always take down the sites ourselves: these sites are in different geolocations, different languages get spoken in those countries, and you need to be able to communicate effectively to get the site taken down. And then we track. So end to end, we track phishing sites: we want to make sure that they stay down and don't get reactivated, because the sites that get hacked to host the phishing often don't get patched, so the phishing kit gets deleted but then it's up there

a week later again. Okay, so the architecture that we use: we've got Gollum, which is on our network, and then we've got Shelob, which is off the network. These are two EC2 instances, so two separate servers. We use Gollum as the brain of the application, and Shelob basically does the work out on the web: Gollum sends requests to Shelob via an API, and then Shelob will do the web scraping that we need. Shelob is a spider and does web scraping, so that's why we chose that name, and Gollum, for you Lord of the Rings nerds, loves fish, so that's where we got the names. We'll get to that, yeah. So can we just save the

questions for the end? I'll answer those at the end of the talk. So Gollum doesn't need internet to communicate or to do its job; it uses Shelob for that. Gollum is on your network, and basically if someone gets into your network they can get to Gollum, but otherwise, you don't want to visit phishing sites from your network, so that's what Shelob is for: your proxy isn't being flooded with a bunch of requests to phishing sites. Both applications were written in Python, and we use the Django framework. So now the methodology we're going to go through: in the top right you can see which step we're on, and also down the left-hand side. We're under detection now, so the

first method of detection that I mentioned was via email. So there's a client busy reporting a phishing mail, and now we need to ask: do we need internet to read this email or not? If we're using Microsoft Exchange, then that email is going to be on our network and we don't need Shelob to fetch it; if it's off the network, then Gollum can't reach it, because Gollum doesn't have internet access, but if it's on the network then Gollum can just fetch the email. You can read those windows in your best Gollum accent in your head. Okay, the second method we've got for detection is using the referrer log. So this is basically the web scraping that I was referring

to. When a phishing site is spawned, if that phishing site or phishing kit wants to stay current, so it always looks like your website, then it's got to dynamically pull the content from your website. So, just on what a referrer log is: it's the log that tells you how the person got to your website. If you google a website, let's say PayPal for example, you google PayPal and then you click on the PayPal link inside Google, then in PayPal's referrer logs they will see Google over here. The referrer that referred you to them is going to be Google, and then all of your details are also in that referrer log, so
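A referrer-log line in the common Apache/nginx "combined" log format carries exactly the fields described here: client IP, timestamp, request, HTTP status, referrer and user agent. A minimal parser, assuming that format (the sample log line is invented for illustration, not real Absa data):

```python
import re

# Apache/nginx "combined" log format: ip ident user [time] "request" status size "referrer" "agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Static content being pulled from the brand's site is the interesting signal.
STATIC_EXT = (".png", ".jpg", ".gif", ".css", ".js")

def parse_referrer_line(line: str):
    """Return the fields the talk lists: IP, timestamp, request, status, referrer, user agent."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

line = ('203.0.113.7 - - [12/Dec/2019:10:15:32 +0200] '
        '"GET /static/logo.png HTTP/1.1" 200 4523 '
        '"http://evil.example/absa/login.php" "Mozilla/5.0"')

rec = parse_referrer_line(line)
print(rec["referrer"])                   # the page that pulled our content
print(rec["path"].endswith(STATIC_EXT))  # True -> static asset, worth a closer look
```

A GET for static content (images, JavaScript, CSS) whose referrer is an unknown site is the signal described in the talk: something that isn't your website is pulling your brand assets.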

we've got your IP address, we've got your user agent, and the date and time that the request was made. So this is very useful for detecting phishing, but we can also use it to protect our clients. If we take your IP address, your user agent and that timestamp, we can correlate all of that data against our internet banking logins. So we can say: oh, that IP address was seen logging into internet banking with that user agent at a similar time, so now we know with a high likelihood that that client has visited a phishing site, and then fraud can contact that client, because we'll have the session data associated with that IP

address at the time, which is quite cool. Then we've got the request type, so the HTTP request that was made, and you're going to get a lot of GETs coming through, not many POSTs, because, you know, it's trying to pull content from your site. And then the file type is also very important, because that's going to be static content: if it's pulling things like images, JavaScript, CSS, that kind of static data is what we care about for phishing. And then the last thing is the phishing page. That referrer log over there actually tells us the website that was pulling this data, so when a client visits the phishing site they

generate this entry in our logs, and then we can see that phishing site. So now we've got the phishing site, because a client visited the site, or the fraudster who spawned the site visited it, so now we've actually got the website and we need to do something with it. So this is like our architecture, but I'm sure most companies would have this: you've got your website, then the logs come into your SIEM, your log aggregator, and then we've got Gollum over here that makes an API call into the SIEM to fetch the referrer log. Basically it's just a search in whatever query language your SIEM uses. Okay, so onto

classifying emails now. So an email will come into the mailbox, and we've got a script that fetches the emails every minute and puts them on a memory queue, and then we get the HTML attachments off the email and the links out of the email; we just parse that data out of the email. Then we load the data into a headless browser. A headless browser is a browser on a server that's not connected to a screen, hence headless, and it can still take screenshots: you can load a page in a headless browser and still take a screenshot, and people use them for things like unit tests and that sort of

thing as well. So we get the HTTP status of the website; there's no point in even starting this if the site's already down, if it's a four or five hundred or something. Then we get the HTML content, and this can tell us the structure of the page, and the structure of phishing sites is also similar: they vary a lot, but a whole group of phishing sites will have a similar structure. Then we get the GET requests, like we've spoken about, those are important, and we'll take a screenshot of the page, and the screenshot is sent to an AI model. We trained this, I'll talk about that now, and then the AI model classifies it

as being a specific brand, so if it matches our brand then we know this phishing belongs to us. So, a word on AI: we're using transfer learning to train our model, and transfer learning is really cool because you start with a big neural network. The one that we're using is from TensorFlow, and they've got a model that can basically do widespread recognition, so it can recognise humans, cars, dogs, people's faces. We take that network and we remove the last few layers, the ones that do the classification, and then we give it our data set and say retrain, and it retrains and rebuilds just those last few layers. The previous layers are all about feature

extraction, and the last few layers are all about classifying the data. And why that's really cool is you can use a really small data set but still get a high accuracy, which is unusual for neural nets: you usually need a lot of data to get good accuracy. We also have the black box of AI. Word on the street is that no theory exists for why these models work so well; we know they work, because, I mean, they're on everyone's phone classifying your face, but we don't always know when they won't work. So there's a lot of guesswork involved in what data you need and how much data you need, and they can be

error-prone and quite buggy. So, pornography: here's a weird one to have on the slide, but basically we've been running this bot for about a year, and we've been taking screenshots of phishing sites, or websites that are reported to us via email, and, you know, spam comes into the email, and we sometimes get links that take us to websites you wouldn't want to be on. So our bot has taken pictures of a few nude women, and then it would ask us: please classify this, right? So then it sends the whole team a picture of a nude, which some people in the team really enjoyed. So we found out about those bugs, and that's basically just the data set

that we chose in the beginning; we didn't know what would be sent to us, right? So we didn't have pornographic images in our non-phishing data set. We don't now either, but we've at least got pictures of people. Okay, and then classifying the referrer logs, so that's the log that was in a previous slide. We write a regex to extract the pieces out of that log line, then we get the referrer, and then we get the HTTP status of the referrer as well; if it's not a 200, or under 300, we don't carry on. And then we get the GET requests and classify the GET requests, and we can't use image

recognition at this layer, because the referrer is maybe a CSS file, which isn't going to look like our brand; it's just going to be a lot of plain text in the browser. So the GET requests tell us what content that CSS file is fetching from our page, and we've got a threshold: if it fetches enough static data from our page, then we know, okay, this is very likely to be phishing, and we advance to the next block, where we try to guess what the actual phishing site is. So if you're familiar with DirBuster, Gobuster and wordlists for brute-forcing, we basically have our own wordlist that brute-forces for the actual

phishing page, and we just get 404s, 404s, and then we get, like, a 200 on the page, and then we load that in a headless browser, screenshot it, and classify it in a similar way to the way we were classifying the email. Okay, then on to verification. We've spoken about classifying; now we need to verify whether these things are in fact phishing, and not pornographic pictures. So analysts use the bot to confirm a phishing page, and this is what our bot currently looks like: it'll send you a picture and a URL, and then ask you, is this phishing, is this not phishing? And you can report phishing to the bot as well, you can

receive statistics from the bot. So we've got use cases like a manager who might want to see statistics on how well the application is performing, and then the devs want to see errors in the application, so we send the error log to a Telegram bot as well, but just for the dev users, and then our analysts confirm whether pictures are phishing or not. Okay, onto reporting. Like I said, we report all of our phish to PhishTank and also to our anti-phishing vendor, and PhishTank unfortunately doesn't have an API for submission; they've only got an API to pull data from them. So we use web scraping, and we log in with an

initial user that will report the web page, and then we've got four verification users that log in after that user and verify that phish, because, yeah, it's like a sneaky little botnet. So this is what Google Safe Browsing does: about 30 minutes after we report the site, we get this showing up on all of our phishing sites, which is pretty cool. It doesn't always get marked as phishing, even after our verification thing, but we've got about a 95 percent rate on that. And then our reporting to vendors: we support everything at the moment except OAuth, we'll get into that later in the talk. Then we want to track the phish, to make sure that websites go

down and stay down, so we re-visit these websites. We keep them in the database and visit them: if it was up, we'll visit it every hour; if it was down, we'll visit it maybe once or twice a day; and if it's down for a long period of time, maybe once a week we'll visit it. We check the HTTP status, that's the easiest way to tell whether something's gone down, but sometimes it stays on a 200. For example, our anti-phishing vendor sometimes will just put their image over the page to take it down, so the page will stay 200 but the phishing content is actually not there anymore. So we also

have to look at whether the HTML content has changed. Sites do get reactivated, so you might visit a closed site and now it's open again; like I said, the websites that are hacked to host phishing often aren't patched, so we see that quite often. And we've seen our IP address and user agents blocked as well: when we analyse a phishing kit, we'll say, oh, those are our IP addresses in this list of IPs that are getting blocked. But, you know, that's not unique, the anti-phishing vendors' IPs are also blocked, so we had to use proxies. So we have a script that fetches proxies

every morning for us, just open-source South African proxies, and then, open-source is probably the wrong word, but free South African proxies, and we use those proxies and randomised user agents to visit the site. Okay, so that's the end-to-end of how we deal with phishing. I just want to talk about the architecture before the demo. So this would be your company; then, like I said, Gollum doesn't have internet access but is in a VPC that's connected to your company, and then we have a non-connected VPC that has Shelob in it, so that Shelob can visit the internet for Gollum. The referrer logs come into the SIEM, and then we use Microsoft Exchange, so a

client will report an email to us, and Gollum will go and fetch those emails from Exchange every minute, that's just a cron job that runs, and those emails are put on a memory queue on Gollum. Then Gollum will parse the email and extract the links and HTML attachments, send that to Shelob, and Shelob will do all of the analysis that I was talking about and send the classification back to Gollum. If the site is classified as being Absa, then Gollum will take the screenshots, HTML content and GET requests and store them in the S3 bucket, and we'll also create a database entry. So we're using the Relational Database

Service from AWS, just a MySQL database, and we've got some web application firewalls in there. So that's the egress one, so we can only talk to Telegram. Gollum, or the Telegram scripts, always poll Telegram; even if you send Telegram a message (normally that would be a webhook), we always poll, so Telegram never has to push anything to Gollum, even if you request stats or anything like that. So this is all one-way traffic, no one can talk in there, and then this web application firewall will only accept connections from that VPC. For authentication, this has Django REST Framework on it, so it's using a token, and then the API

gateway from AWS also has token authentication. And then, just to finish the end-to-end: the database now has the phishing email in it, and then a Telegram message is sent, you know, saying, is this phishing? When someone clicks yes, what happens is Gollum will then tell Shelob to report this phishing site, and also, before that verification, the client gets responded to. So if we've seen that phish before, then you'll get a notification very quickly that this is a phishing site that's been seen before, but if it's a new phishing site, then we'll send you something that says, like, we think this is phishing, you should be careful with this email. Okay, so we use Terraform to deploy our

infrastructure; it's actually a requirement at the bank to use Terraform and Ansible to deploy into AWS. Basically what it does: it's a bunch of configuration scripts for your infrastructure. You run a command on your computer, and it takes all of those configs and says, okay, this is what you want to build in your AWS or Google Cloud Platform or Azure, so it's cross-platform, and it'll also tell you, this is what I'm going to build, and then you can agree to that, accept, and send it through. So we've got Terraform scripts that anybody can use for managing the infrastructure as well. Then Ansible to automate deployment: it's also configuration scripts, and it connects via SSH to your

infrastructure and will deploy code for you. It'll run commands to install software, and I've got the list at the bottom here: we're installing nginx and SSL certificates using Ansible, then installing Django, we're using virtual environments, and Gunicorn, it installs all of those, and then it installs Chrome and mitmproxy, the man-in-the-middle proxy we're using to capture the GET requests that are made. And then basically when everything's finished installing, it starts up your application for you, so after you've run your Terraform and Ansible scripts, the whole platform is up and running. Okay, time for a demo. So I've brought my fiancée here as a sacrifice to the demo gods, so if

everything goes well we won't have to sacrifice her. Oh, there, yeah. So this is the, oh, thank you, got to drag that over. Okay, so this is the platform that we developed. Basically, when you deploy all of this with Terraform and Ansible, you will be greeted with this as the welcoming screen; it's the first screen to onboard your company onto the platform. You can put in the company name and company website, and when you click Next it's immediately going to start training a model for that website: Chrome is going to go visit that website, take screenshots of what the website looks like, then duplicate that into a folder, delete the old model,

retrain a new one, and then you've got your company onboarded. And what's cool about that is you can onboard multiple companies onto the platform. So if you're a bank like Absa, we've got other banks in Africa, we look after some of their security as well, and their phishing, so we can onboard them and their mailboxes onto this platform, and the model can classify more than one company. Another thing that's cool about that is if you've got a competitor's phishing email, so you know where to report a competitor's phishing mail to, then you can onboard your competitor, and if something is classified as your competitor, then you could just send that email straight off

to them; you don't have to send it to an analyst mailbox, so you're also saving them time that way. Okay, so I'm not going to train a model now. I was going to, but what happens is that the initial model that you get from TensorFlow is downloaded and stored in the temp directory, and I shut down my laptop yesterday, which means my temp directory is gone, which means it's going to have to re-download that model. I'm not going to do that, but I've got a few things trained for later, so I'm just going to say skip for now. This would be the referrer log onboarding: whatever SIEM you're using, you'd put the SIEM details in here, and

that would be your search string to pull out your referrer logs, and then you can put your directory brute-forcing guesses in there as well. So the API: we're using RiskIQ and their API for getting the WHOIS details, SSL information, and some open-source threat sharing information about all the websites in the platform, so all that data gets stored as well. What's very useful about that is you can sometimes see the registrant's email in the WHOIS details, so if a fraudster spins up a website and they do put a fake email in there, that can help you correlate to other phishing kits. But also, if it's a hacked website, which 90% of the time it

is, then you can get the hacked website's registrant's email, send them an email and tell them that their website's been hacked. So, email configuration: just inbound and outbound email, and the outbound email uses the same credentials as that. This is actually for the analyst email: you'll have an email coming in, and if you can't process it you need some way to send that email on, that's what those are for, so that someone else can look at it. Then we've got PhishTank details over here, the reporting user and all the verification users, and then your anti-phishing vendor's details, and like I said it doesn't support OAuth yet, so you'd put in your username and password; it

would help with OAuth authentication, but on the back end it's not there yet, no. Okay, so that's just verification to show that everything was successful. Now I'm going to move on to the sandbox. Okay, let me show you. So this is not Absa; this is an HTML attachment that's on my machine, and this is just a phishing site that I created. This is very typical: someone will email you an HTML attachment, you'll click on the attachment in the email, you think it's your statement, you fill in your credentials like this, and then it doesn't take you to your statement. So the password, click Next, and now those credentials were actually just submitted

to that site. So now there's a PHP script there, it's not an HTML attachment anymore, and this is just an image, which is also very common in phishing sites. So I'm going to submit that HTML attachment to the sandbox in Gollum, that phish.html, that's the phishing page. Now, this does take a while. What's happening now is Gollum is talking to Shelob; it's sent it the phishing link, it's going to load that in a headless browser, take a screenshot of it, and then bring all the details back here, with what it thinks is the actual phishing site, because you can't report an HTML attachment either. You can't report that to PhishTank, because they

don't care about HTML attachments, they want the link that's the phishing site, and you can't report it to your anti-phishing vendor either, because then they first have to figure out what the phishing link is, like you should be doing, and then send that straight to them. So there's the screenshot of what that HTML attachment looks like, and you can see here this is the received-creds PHP file, which is where the credentials were posted to. If I click on extracted links you can see it's a 200, and basically that's how we found that link. Then I'm going to show you another example: PayPal's login. So PayPal, I don't know if it's still

like this, but when I started working a few years ago they were huge targets of phishing. So, thank you, we'll wait. What's happening again now is that link is being sent to Gollum, but this time it's a link, so Gollum doesn't have to figure out what the actual phishing site is in the attachment. The HTTP GET requests are going to appear here, the SSL information there, the WHOIS information there. So I can show you: there's the screenshot from PayPal, luckily there's good internet, here's the geolocation of PayPal, and then it says PayPal target, 99% probability. So the model that we've got on the back end, we've trained it on a few things, PayPal is one of them, we've

also trained it on Absa. HTTP, there's the GET information, the SSL information, and one thing to mention is the Alexa rank, which is not, okay, so not all of this data seems to have been pulled through, that's okay. The Alexa rank is usually shown in the SSL information, and PayPal has got quite a high Alexa rank. So what that is: it's the Alexa top million, which is the top million websites visited, I think on a rolling basis probably, so Google is number one and PayPal is in the top 100, I think. Oh yeah, it's supposed to have the values here, it's not doing that, that's okay. And then why that's cool, though, is if

you see phishing sites that have a high Alexa rank, then I think they'll be taken more seriously, because, you know, these sites are popular and they've been hacked and now someone's hosting phishing on them. So we'll probably write a rule that says if the Alexa rank is higher than 5,000, notify us via Telegram, because then it's a fun story. Then the WHOIS data, the registrant emails, I spoke about that, you could send that email notification saying your website's been hacked. Then extracted links, so the HTML was fetched, that's the raw HTML content of PayPal. And we don't have phishing kit analysis yet, or finding harvested credentials, and open-source threat

intelligence is blank now as well, because PayPal is, yeah, not malicious. Okay, this is our phish board. These are just examples, so there's the example that I just uploaded now, and we can click on details. This would list all of the phishing sites for the organisation; you can search by target, so the different companies that you've onboarded would be searchable, and date-time and filters and all that. So

here are the details for the specific phishing site. There was a shortfall in our previous iteration of this app: we didn't have a way to drill down into the phishing sites to look at them in more detail, but there's the GET requests, the extracted links, the WHOIS information is there, so you can look at all of that in the platform as well. And then something else: if this wasn't Absa, or it wasn't being classified as Absa, like if the probability here was much lower, then you could go edit target and select which target it was. So these are all the companies I've onboarded, and the model would recognise any of those

things, and then you can tell it how important it is that the model detects this thing as being that brand the next time. What this is going to do: it goes and screenshots this page, then replicates it and puts it in the data set with all the other data for that specific target, and then retrains the model, so it makes it more accurate the next time it's classifying that image. Then we've got statistics: total sites for December, all of these little cards at the top here just show, like, details for the day or for the month, and, yeah, they've got geolocation of where all our phishing sites are. We've got

sites per month as the year's gone on, so this is actual Absa data, and you can see we receive a hell of a lot of phishing. Then this is phishing kit attribution, so what percentages, you know, if there are four phishing kits targeting us, then 30% is this phishing kit, and we just take the last file of the phishing kit and, you know, that becomes the name for the phishing kit. I think that's the demo done. Okay, so I'm going to go back to the slides now.
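The wordlist brute-forcing described earlier, guessing the live phishing page behind a referrer or an HTML attachment by trying common kit paths and keeping whatever answers 200, could be sketched like this. The wordlist and URLs are invented for illustration (not Gollum's real list), and the fetcher is injectable so the sketch runs offline:

```python
import urllib.error
import urllib.request

# Hypothetical wordlist of common phishing-kit paths.
WORDLIST = ["login.php", "absa/", "secure/verify.php", "index.html"]

def default_fetch(url: str):
    """Return the HTTP status for a URL, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code
    except urllib.error.URLError:
        return None

def find_phish_page(base_url: str, paths=WORDLIST, fetch=default_fetch):
    """Brute-force candidate paths: lots of 404s, keep whatever answers 200."""
    hits = []
    for path in paths:
        url = base_url.rstrip("/") + "/" + path
        if fetch(url) == 200:
            hits.append(url)
    return hits

# Demo with a fake fetcher so the sketch runs without network access.
fake = lambda url: 200 if url.endswith("login.php") else 404
print(find_phish_page("http://hacked.example", fetch=fake))
# ['http://hacked.example/login.php']
```

Anything that comes back 200 is then loaded in the headless browser, screenshotted and classified the same way as an emailed link.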

Yeah, we didn't have to make a sacrifice. Okay, some statistics — I find these really interesting. 2.3 percent of all of the phishing that we see is reported by Gmail users, and that's tiny compared to the number of people that use Gmail. We think two things are contributing to this. The first is that there's no data set of Gmail addresses belonging to South Africans — if one existed, I'm sure they'd be targeted a lot more; maybe that data set does exist, I don't know, I'm just guessing — and I think Gmail has pretty decent security. The second thing is that Google Safe Browsing blocks phishing sites, and we know that they do the same

for email. So as soon as we report a phishing site to PhishTank, 30 minutes later — or whenever Google consumes that data — they block it in Gmail as well. So I think fraudsters just have a hard time targeting Gmail users, which is why we don't see a lot of Gmail users affected. Carry on using Gmail if you are. Then there's SSL on phishing pages. A lot of people think SSL means safe — hopefully no one in this room thinks a site is 100% safe just because of SSL — but 32 percent of our phishing pages have an SSL certificate installed on them, and that's because it's just a hacked

website — if the site that gets hacked already has an HTTPS/SSL cert installed, then the phishing site is also going to have an SSL cert. That grew from 16% in 2018, which is interesting, given that browsers have been saying they're going to stop supporting plain HTTP. Then, 74 percent of our email reports come from these four providers. If anyone works there, fix your email security — no, I don't actually think they're to blame. I think it's very easy for fraudsters to say these domains are the South African market, so if they find any addresses there, send them phishing emails. And it's the same with all of these

— they're all very specific to South Africa, and I think that's why we see them targeted, not necessarily because their filtering is bad. Next, future work that we want to do. We want to automate the analysis of phishing kits: figure out where the kit posts the information to, and what email addresses are used in the kit. Then we want to do credential hunting: when the kit can't post or email the credentials out, it often just writes them to a txt file or a database file on the server, and then we can go and find that file and alert our customers who have been

phished. Absa does this manually at the moment, but we want to automate it. Then we want to automate emails to the owners of the hacked websites, like I mentioned earlier, using the Whois information. The last three are: SIEM integration — we want to beef that up to make queries to your SIEM asking whether it has seen any of these phishing sites; proxy integration — we've integrated with our own proxy, so we block phishing there as soon as we know about it, but we want to support more proxies; and feeding our customers — we offer a free AV to all our customers, and we

want to feed our phishing data to that AV as well. Then we want to build a central sharing platform: if many people start using this platform, we can all share the data, all of our proxies can block these phishing sites, we can do attribution across industries, we can check whether kits post to the same email addresses, and we could do coordinated takedowns — we could just start working better together, and maybe even get ISPs behind us to block phishing sites at a South African level. That would be cool. All of this is being open sourced. I don't have a link at the moment, because

we still need to do due diligence — our risk officer wants one of the pen tests done first, which I agree with, and we haven't had time to complete that, which I would have liked. So I suspect we'll be open sourcing this at the end of January, but in the meantime you can contact me on Twitter, or in person afterwards, if you really like this. There's still a lot of work to do to get it over the line, so if anyone in the room wants to contribute to this application, that would be really great. Also, if you're interested in the feeds that come out of this — if you

have a SIEM that you look after, you can get a feed from us, and we'll happily share it with you; you just need to contact me. So, thank you very much. I just want to mention the contributors: Benny over here, who's been working a hell of a lot on this in the last few weeks and made the website look so pretty — otherwise it would look terrible; Justin Gifford, who's actually a volunteer at BSides — he also works at Absa and helped out on the early iteration of the application; and Nick Manila, our analyst who works a lot with phishing at Absa — he contributes a lot in conversation and a

lot of ideas, and he does all of the phishing kit analysis and the hunting for credentials. Then I'd like to thank BSides Cape Town and the sponsors for such a cool event. You can follow me on Twitter — I don't know if anyone's followed me yet, so thank me in person. [Audience question] Okay, that's very interesting. I don't know what the browser does if the website returns a 500 but still returns content — is the browser fine with that? Yeah, that's bad news, because one of the big things that we use to filter how many sites we have to go and visit is the HTTP status.
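A minimal sketch of the kind of status-based filter being discussed, with an extra guard for sites that answer with an error status but still serve real content. The threshold and function name are assumptions for illustration, not Gollum's actual logic:

```python
def worth_visiting(status_code: int, body: str) -> bool:
    """Decide whether a candidate URL deserves a full headless-browser visit."""
    if 200 <= status_code < 400:
        return True
    # Some phishing pages answer 4xx/5xx yet still ship a login form,
    # so don't discard error statuses when the body looks substantial.
    return len(body) > 512 and "<form" in body.lower()

page = "<html><form action='post.php'>" + "x" * 600 + "</form></html>"
print(worth_visiting(500, page))  # True
```

The point of the second branch is exactly the audience member's scenario: a 500 response that still carries a working phishing page should not be filtered out.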

But that's interesting — thanks, we'll check that out. [Audience question] Yeah, we've got an issue with smishing at the moment, and it's much more difficult to report smishing as a client, because you've got to copy it into an email — you can't just click forward. So we want to bring in an easier way for clients to report smishing to us, and that's coming in the future: a mobile number that you can just forward the smishing SMS to.

[Audience question] It does, yeah. We haven't seen that.

[Audience question] Yeah, you can give it different resolutions. If we saw that messing with the classification — well, whenever we misclassify anything, it goes to our analyst mailbox; if the analysts see phishing there, they know the bot failed to detect it as phishing, and they notify us. That's how a lot of the improvements to the system have been made. So that would be one of them: if we saw that the screen resolution was causing misclassification, we would change our resolution. [Audience question] Okay — I've covered that briefly, not in detail. Yeah, so both those things

are controversial. Fraudsters sometimes leave their backdoors up, and — I don't want to say we log into them, but I think lawfully you can't log into that backdoor, because then we'd also be hacking someone's website if we took that content down, so we don't do that. As for spamming them by email, we're worried about the same thing — at the end of the day you're irritating the fraudster a lot if you do that, whereas you're just doing your job if you do what we're doing. [Audience question] Have fraudsters wised up to the referrers specifically? Yes, a hell of a lot. When we started the

referrer thing, we were getting almost a hundred percent of our phishing through the referrers, and now we get maybe 70% through referrers. A very nice phishing kit, in my opinion — though also easy for us to detect — is just a base64-encoded image with edit boxes positioned over where the credentials should be, and one post button. It loads no content from the target; it's just a picture with some edit boxes.
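A rough heuristic for spotting the image-only kit described here — a page built from an inline base64 image plus a handful of inputs. The regexes and the input-count threshold are illustrative assumptions, not the detector Gollum actually runs:

```python
import re

def looks_like_image_kit(html: str) -> bool:
    """Flag pages that are mostly one base64 inline image with a few inputs."""
    has_inline_image = re.search(
        r'src\s*=\s*["\']data:image/[^"\']+;base64,', html, re.I
    ) is not None
    input_count = len(re.findall(r"<input\b", html, re.I))
    return has_inline_image and 0 < input_count <= 4

page = (
    '<img src="data:image/png;base64,iVBORw0KGgo...">'
    '<input name="user"><input name="pin"><button>Post</button>'
)
print(looks_like_image_kit(page))  # True
```

Because the page fetches nothing from the target brand, referrer-log detection misses it, which is why a content-level check like this is useful.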

[Audience question] You can do that, yeah — it would be a nice addition. Our colleagues report phishing either to the internal mailbox that we've got set up for phishing, or to the external one, and if we see something is from an address at our own domain, we respond to it differently, because we know it's a colleague. But our model is very different from an initial onboarding model: we've been doing this for a year, so we've got a lot of screenshots of different phishing, and our model is a lot more robust than someone's initial model would be. For example, with Office 365 phishing, that page changes

quite regularly, and you've got to update the model regularly — which the platform now deals with really well, but it didn't in the past. So I think it would be a nice use case. [Applause]