
okay they closed the door so i'm going to take that as my cue to get going here thanks for showing up this is uh my first time here at bsides charm i'm really excited to be here my name is andrew uh i'm a professional coffee drinker over at the dod um i'm really excited to be here to talk about a project i've sort of worked on for a little while um something that's near and dear to me can we move this away uh the election security problem which has kind of been a personal and a professional interest for some time of mine um i've been fortunate to work on sort of that problem over on the federal side
and i was excited to come here to bsides charm to talk about that too and talk about it in the context of uh of the conference here when we're talking about election security so um the small part of that problem that i wanted to look at was how you know election security which is a big overarching problem set how that was impacted and affected by something that i think we're all unfortunately individually impacted by which has been data breaches um so that's just kind of the small little aspect of it i wanted to look at this slide's kind of rhetorical because i think we've all sort of the last two days heard about you know what's going
on and why this is such a big problem and why political parties especially state and local ones elections and campaigns why they're so vulnerable and why they're so interesting for adversaries to target so you know the part of it i wanted to look at you know looking at data breaches and looking at state and local parties is because i wanted to look at something that was probably more easily identifiable with as a you know at the individual level because state and local parties are how we you know participate in democracy it's how we vote it's how we meet our party and it's how we meet our elected officials so it impacts us much more personally and i think directly than
maybe some of the federal national level ones do um so i wanted to start looking at you know see what what did this ecosystem look like and um you know what part of this could i take a look at um so i started looking around at you know some some party websites this is the the michigan state republican party website and i started to notice something on the party websites this is michigan's right now they provide a lot of contact information for local officials like district and county level chairs sometimes they're party staff uh here's the vermont state democratic party website uh they did the same thing i was curious if this was just a one-off
thing i didn't understand at the time why organizations were doing this so i you know i kept looking this is the alaska state democratic party website um they actually had an attachment for me you could just grab emails addresses names stuff like that um here's the illinois state republican party theirs was already in a spreadsheet which was handy for me but i kept seeing this over and over and over again here's the dc democratic party website uh you said it's a little tough to read there at the back it's a full list of dc democratic party committee members and their contact information addresses personal emails phone numbers names i didn't understand why organizations were doing this
but they all kept doing it personal contact information personal emails that were publicly listed on their website this is the massachusetts state democratic party website you can see there at the bottom it's a little tough to read there but it's one to ten of 557 entries i didn't understand why this was happening at this scale but i realized that i'd found something that i wanted to look into so understanding the scale that you know these parties work at that there are hundreds of state parties out there there are thousands more of local level and district level parties under them um the scale here is is significant they're they're massive massive organizations and data breaches you know
tend to also be pretty massive in in comparison as well and you know unfortunately as someone who's you know been breached before too um it's a it continues to be a problem it's ubiquitous in the industry but the reason i thought this was interesting was because you know these organizations were providing private emails um just in some cases volunteers in some cases party staff local and county and district level chairs this was interesting to me i wanted to know why they were doing this and then i wanted to know what the risk to those organizations was what did it mean that you had you know hundreds of emails on your website what would that mean for your your risk profile what
does that mean for um you know how secure someone's gonna think your organization is um it's particularly interesting for emails too just because the way we use emails today it's the the trusted key at the bedrock of the online authentication scheme we use them to log into accounts we use them as usernames for other accounts as two-factor auth for all kinds of accounts and services it's a really key part of our kind of online schema and i wanted to know what it meant that the organizations were doing this so my problem statement that i wanted to go after was you know how to quantify empirically the level of risk for state and local political parties
caused by account exposure and data breaches at scale and i realized pretty quickly you know you may have realized as well looking at the first couple slides i showed you that this was a this is a data problem first before anything else and so i needed something that was going to help me understand and tabulate that data so i put together this little tool called hook shot it lives on my public github there at the bottom it's very very simple and it comes with absolutely no warranty or guarantee of working but um what i wanted to do first was to go out and set you know with a deck of urls pull back every email on those urls
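A minimal sketch of that first step (take a list of URLs, pull back every email address on each page) might look like the following. It is not the speaker's tool: the regex, the function names, and the use of Python's standard library with a thread pool are all assumptions made for illustration, and the pattern is deliberately simple rather than a full address validator.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

# Deliberately simple pattern (an assumption): good enough for addresses
# printed on a web page, not a complete RFC 5322 validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_text):
    """Return the unique, lowercased email addresses found in page_text."""
    return sorted({match.lower() for match in EMAIL_RE.findall(page_text)})


def fetch(url, timeout=10):
    """Download one page as text (hypothetical helper, standard library only)."""
    request = Request(url, headers={"User-Agent": "email-exposure-sketch"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape(urls, fetcher=fetch, workers=8):
    """Fetch every URL in parallel and map each URL to the emails on it.

    Any exception raised by the fetcher propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pages = pool.map(fetcher, urls)  # results come back in input order
        return {url: extract_emails(page) for url, page in zip(urls, pages)}
```

Passing the fetcher in as a parameter keeps the extraction logic testable against saved page text without touching anyone's web server.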
i used curl to do this it's a it's a tool in the default kali linux distro it's a great open source web scripting tool i ran that in parallel within the program so you can go out and scrape dozens of these at the same time that you know to start gives you some interesting analytics because you know one it's interesting in terms of magnitude how many emails are on this website it's also interesting in terms of the type of emails you get back whether it's private emails gmails yahoos and hotmails or whether it's like a dot gov or a dot org or in some cases like a dot gop or dot dem so just the type of data you get back
just from what you've scraped off the website can be really interesting then the other the second half of this tool was taking all those emails you know by url and then throwing them against the have i been pwned api which is a super flexible low cost super easy api you can query against the service for an account and it'll return back any data breaches that account has been in then you can query on the data breaches as a selector as well and it'll tell you the type of data that showed up on that breach password financial information addresses super easy to use api for the low low cost of like three dollars a month
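The lookup described here can be sketched against the public Have I Been Pwned v3 API. This is an illustrative sketch, not the speaker's code: the function names and the user agent string are assumptions, and the block only builds the request and parses a response body, so sending it (for example with `urllib.request.urlopen`) and handling rate limits are left to the caller.

```python
import json
from urllib.parse import quote
from urllib.request import Request

HIBP_BASE = "https://haveibeenpwned.com/api/v3"


def build_breach_request(account, api_key):
    """Build the GET request asking which breaches an account appears in.

    truncateResponse=false asks for full breach records, which include
    the DataClasses field (the kinds of data exposed in each breach).
    """
    url = f"{HIBP_BASE}/breachedaccount/{quote(account)}?truncateResponse=false"
    headers = {
        "hibp-api-key": api_key,                # the paid API key
        "user-agent": "email-exposure-sketch",  # hypothetical client name
    }
    return Request(url, headers=headers)


def summarize_breaches(response_body):
    """Map each breach name to the kinds of data exposed in that breach.

    An account with no known breaches is answered with HTTP 404 rather
    than an empty list, so the caller should treat that status as "none".
    """
    breaches = json.loads(response_body) if response_body else []
    return {b["Name"]: b.get("DataClasses", []) for b in breaches}
```

Keeping request construction separate from parsing means the parsing can be exercised on a saved response without spending API calls.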
um so that's what the tool did kind of the first two parts of it and then the third aspect was putting all that together and some kind of analytics like what okay what does it mean to have x amount of emails all this out there when i started i wanted to run this against all 50 states so i started with 100 websites which is all the state democratic and republican party websites and the results i got back were a bit broader than i expected um out of those 100 urls 92 of them listed personal emails and on those 92 websites i got back just shy of 10 000 private email addresses um a bit bigger than i was expecting
going into it um it's like an average of 100 110 per website um which to me told me that this was much much bigger than just those couple websites i started looking at the start this was systemic um of those 9900 email addresses um about two-thirds just over 6000 had been in recent data breaches i don't think the organizations are aware of that and i'll talk a little bit more about that later like why i think this is a situational awareness problem but that was you know to me that pointed to a substantial substantial amount of account risk out there um i'll talk a little bit more about the like the specific analytics that i built
into the tool but one of them that i i pulled out that i thought was particularly interesting was i sorted for corporate accounts .gov .gop .org and i got 363 of those that had showed up in data breaches on those websites so mostly personal email addresses a lot of gmails and yahoos but they're also breached corporate accounts which i think to me was interesting to look at at the individual url level to drill down to that and say hey you know why does this particular state have this many like breached corporate accounts so you can really pull back some interesting sort of just quick level looks at these organizations just based on that here's a couple individual results up at
the top left there that's a northeastern state democratic party website with 989 email accounts on their website 740 or so had been in data breaches that was the worst offender of the group kind of an extreme example i know it might be a little tough to read in the back there the one in the middle though was a fairly typical result this was a midwestern state republican party website uh with about 50 emails about two-thirds of them had been breached and then when i ran this against an expanded list of urls when i included third parties about 195 state level green libertarian and constitution party websites i got back the one on the top right
which is a west coast state green party website fewer accounts on the website but generally the same sort of levels of exposure which to me again kind of pointed to the fact this was systemic this wasn't someone doing something wrong for one state some of the specific account you know account analytics there at the bottom i wanted to put my finger on okay you know what did it mean to have 989 emails out there what does it actually mean if 750 of them are breached so to start with you know the first kind of easy analytic to put in you know in someone's face is okay you've got six thousand breached accounts here um those
accounts are ten times as likely to be phished targeted by phishing at least to start if for no other reason the fact that your information is now out there and you've got personal information exposed someone can go scrape your email address off of a paste and send you an email if you're crafting sticky like personalized phishing content those phishing messages are three or four times as likely to be successful i don't even need your current password to do that it's just based off of what you can put into there from the data exposed in a data breach so that right there was i think an interesting analytic for an organization to say yeah i've got this
many accounts who are now more susceptible to being phished um then i kind of went a little deeper and there was a really cool study out of virginia tech a couple years ago that looked at accounts and data breaches and tried to match passwords between the same username found that about 40 percent of users were reusing passwords um so if you start with you know 6000 breached accounts you've got 2600 or so potential um instances of credential reuse where someone's reusing a password or username between services that to me is is going to be automatically interesting as you know an organization where yeah someone's personal account has been exposed but now potentially you put you know login mechanisms or
authentication at risk from my organization's domain because you've reused accounts and passwords then you know going down kind of even further than that there was a really really cool study out of google that looked at breached gmail accounts and they found that between seven to fifteen percent of breached emails included the password currently in use for that account so if you use that as a conservative lower estimate you've got around 400 or so credentials that have been exposed that are currently in use for that account and this is just because you got unlucky that somebody got owned and your data's out there if you put that together and you say okay you know these users reuse
passwords these users just got unlucky and have their current password exposed you've got around 180 or so instances where you could potentially have someone who's reused a password that showed up in a data breach that could be a valid login to these websites if you remember we started with only 100 urls and you got maybe around 180 or so it's not a guarantee but that to me would be sort of concerning if i've got 100 email addresses on my website how many of them you know belong to folks who've got a login to my website if they're local volunteers or district or kind of precinct level officials that would be concerning to me um does anybody want to venture a guess
who did it worse or who did it better i'm willing to trade swag if someone gets this right who said that okay get me after i'll i'll get you this but that's exactly right your average democratic state party uh website had 100 private accounts on about 64 of them have been breached your average republican party website a little bit more accounts but the exact same it was even pretty much the same for third party websites fewer accounts on them but almost the exact same level of exposure that to me what that tells me this isn't it's not a partisan problem it's not a regional problem it's not a state problem this is a a systemic one this is
a people and culture and it started helping me understand why these organizations were doing this you know it's the core function of these organizations to be publicly accessible to interface with the public to introduce them to their elected officials um i don't think that you know the fixes that you know you can suggest i don't think it's just effective to take all these emails off it's you know if the party's not reachable not accessible it defeats the whole point of why the website exists in the first place so i think when we talk about mitigations and like what this means here in a second i think that's an important thing to think about because yeah i mean you can put in technical
fixes but the core mission of the organization like we heard about yesterday like for a campaign to win an election it's not going to matter if they're not reachable they have to be reachable and responsible to constituents so that to me i thought was interesting because it's this is not a one organization has done it bad it's everybody seems to be doing this and i think i thought it seemed anomalous at the time another analytic i wanted to look at was you know has something happened related to a website recently with a state level political party that i could potentially look at and say hey was you know was there something predictive about the data i had
does anybody remember when this happened last fall so this is the aftermath of the texas state republican party website after anonymous took it down in september and threw a bunch of pokemon up there um so site went down for about two days until the admins could get uh get control again they took it down for like two weeks put in a whole bunch of technical controls technical mitigations brought the site back up i realized though that when i first started working on this project i had started before this happened i had actually gone and scraped the texas gop website about a month before this happened and so i wanted to see if like the data
i had was this indicative at all i'm not saying it's necessarily how it happened but from anything that i would have seen in that data would i have thought that hey maybe this organization is more susceptible to something similar like just like a website takeover which i would consider pretty low sophistication low equity a month before the hack those are the numbers i got back i'm not saying this is necessarily how anonymous got into the website but if you've got 236 breached email accounts that are publicly listed on your website the odds are pretty good that you've got around 15 or so valid credentials out there if someone's got a login to that website and all i need is a web browser
and an email account to get them i don't even have to phish i don't have to do anything like that so it may not be how it happened but to me you know when i look back and i look at that website then that to me looks like an ample attack surface whether i'm you know targeting folks with phishing whether i'm crafting particular specific content or if i'm just targeting your organization at large that gives me an enormous base to start from so six months later right you would imagine that things have been fixed the website went down for two weeks they renegotiated things with their service provider a lot changed this was a tremendously public impactful damaging
thing for the texas state republican party both in terms of like literal cost and then also reputational cost and i know they spent a long time looking at this website and trying to fix things and prevent this from happening again so i figured that things might be a little different now um this is the current texas republican party website as of last week i saved the page where they have some emails on them when i ran this last week and i scraped the website uh 342 private email accounts 322 had been breached so fewer emails yes but if if i look at this just from the lens of my tool and i look at the before and after
i see more exposure and more risk to the organization now than before the implemented controls and mitigations and i think that gets back to like hey like what's what's the problem here why are they doing this i think there was you know enormous amount of resources put into technical controls technical fixes technical mitigations i don't think anyone stopped to think about what was the organization's mission how do we need to refactor that or how can we integrate security with that to build a more defensible secure organization i don't think they've looked at this because to me like if i see this i think the website is in a worse risk posture now than it was before
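The before-and-after comparison can be put in rough numbers using the two rates cited earlier in the talk: about 40 percent password reuse and a 7 percent lower bound on breached accounts whose current password was exposed. The function below is a hypothetical sketch of that arithmetic, not the tool's actual analytics, and multiplying the two rates together assumes they are independent, which the cited studies do not establish.

```python
def estimate_exposure(breached, reuse_rate=0.40, current_password_rate=0.07):
    """Back-of-envelope exposure estimates from a count of breached accounts.

    Both rates are the approximate figures quoted in the talk; the combined
    figure treats them as independent, which is an assumption.
    """
    return {
        "possible_reuse": breached * reuse_rate,
        "current_password_exposed": breached * current_password_rate,
        "reused_and_current": breached * reuse_rate * current_password_rate,
    }


# Texas site, breached accounts before the takedown and after the rebuild
before = estimate_exposure(236)
after = estimate_exposure(322)
print(round(before["current_password_exposed"]))  # about 17, close to the "15 or so" quoted
print(round(after["current_password_exposed"]))   # about 23
```

On these assumptions the estimate of currently valid exposed credentials goes up after the rebuild, which matches the point being made here.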
but i think no one you know looked into the people and the culture and why we need to integrate with that um do we by any chance happen to have the maryland ciso in the room that would have been so lovely um i couldn't come to you know maryland and bsides charm without having like a local homegrown example as well so here's the maryland republican and democratic state websites um when i ran this tool last week i'm not trying to finger point at maryland i'm not trying to finger point at texas um remember this is everybody this is 92 out of 100 state parties have done this but i i think it just goes to
show you that it's a situational awareness problem i don't think they're aware of what this risk portends for the organization what it means to provide this level of contact information on affiliates officials employees and volunteers and what that could mean for that organization in terms of risk for phishing risk for just me logging into your website with a credential i pulled off a paste bin um it's not any one person's fault or any one state's fault some had a lot more than other ones but um i think it's it's sort of a systemic thing so where you know where i want to go from here um if you remember back to the
alaska state democratic party website i showed you earlier uh that was one when i actually ran this tool last week and i went back to go look at that attachment they had scrubbed out all the email addresses i don't know if it's because they saw my ip hitting their web server logs a whole bunch it's possible i'm not going to take full credit for that but it at least tells me that maybe somebody else thought that this was maybe a bad idea i haven't been contacted by anybody else or anything like that but i think it just goes back to understanding the mission understanding the culture and then understanding
having a number and figure to put in front of decision makers to say hey here this is why this is a risk for you as a reminder like the tool can only pull back what it can see it can't grab off those attachments in many cases so this is probably a pretty significant underestimate i didn't go to any local or county level party websites either i think the numbers would just go up because i know state level and local level the funding cycles are unpredictable and a bit anemic um so this is probably an underestimate but really i think it gets back to the issue at hand here this is a at its core it's a situational awareness problem
it's a people problem it's not just a technical problem there's a lot of other things i'd like to look at with this tool um i'd like to go take a look at some campaigns go i'd love to go look at the local and county levels i had someone suggest looking at school districts as well which tend to be organized somewhat in the same hierarchical fashion between state and counties as political parties are um i think it you know what it helps offer is a quick snapshot on a website hey what data is out there that maybe you're not aware of and what does that risk mean for your organization i think there's plenty to improve on in
terms of the tool as i said my code comes with with no warranty or guarantee of success but a lot to improve on there and i think you know for owners of those domains there's a lot you could improve on using that as well um at this point that sort of wraps it up i'm happy to answer any questions anyone's got any but i'm happy to stick around after please don't leave without me giving you a patch or something here yes
yeah um i'd i'd love to i was you know one of the reasons for coming in here and coming to stuff like this is to help try to get the awareness out there um i haven't you know heard from anybody and i i think it's kind of like we heard in the keynote yesterday like that squishy sort of perimeter like i think a lot of those decision makers are going to say hey we have to be publicly reachable there are websites and there are parties who do this like well there are ones who have contact forms and it just goes to that specific person like the fixes are pretty low sophistication low cost um i think it's just a situational
awareness problem so in most cases honestly they've got ways of contacting folks to like volunteer with the party but you know your given state party website doesn't have hey contact me with your vulnerabilities and exposures um so yeah i've tried to get the word out i tried to kind of meet and network with folks but if you know anyone i'd love to talk with them
i honestly i really wish i could tell you i thought one of the one of the most interesting things that i you know i found with the with all this was that um if you do like a you know user at texasgop.org those guys are showing up in data breaches too um it's not i mean even the websites who use their own domains like for their typically for like state party staff they'll be on sort of a corporate domain those ones are showing up on data breaches too i don't it's it's less about i think listing it you know listing the type of email as it is just providing it on the website i don't
see any other organizations that do this it's not i mean in government or in academia like you usually don't list all everyone's contact information whether or not it's your own domain i don't know i it was anomalous and i thought it was odd and i wanted to understand why
anyone's got any questions okay i'm happy to stick around and chat after i really appreciate you guys coming by it's been it's been fun to talk to you guys about this thank you [Applause]