Researchers Unmask Anonymous Twitter Accounts With 97% Accuracy Using Machine Learning

As many learned for the first time earlier this year when popular outrage forced Facebook and Google to publicly reveal just how much valuable personal data they harvest from their users, tech companies know almost everything about us, including the establishments we frequent, the stuff we buy and the people we know. And in the latest example of just how much detail is unknowingly embedded in our social media profiles, researchers at University College London and the Alan Turing Institute have demonstrated that they can identify a Twitter user with a staggering 96.7% accuracy using only their tweets and publicly available metadata run through a machine-learning algorithm.


For users who occasionally engage in anonymous tweeting, this revelation shouldn't go unnoticed. In their study, the researchers discovered that their most basic algorithm could correctly identify an individual user in a group of 10,000 using just 14 pieces of metadata from their posts on Twitter 96.7% of the time. Furthermore, attempts to obscure an individual's identity by tampering with the data were remarkably ineffective: researchers found that they could still identify users with 95%+ accuracy when 60% of their metadata had been tampered with. When researchers broadened their scope to the 10 most likely candidates, the algorithm's accuracy rose to 99.2%. A single tweet contains 144 fields of metadata, according to RT.

"That’s the mentality with metadata," the study’s lead co-author Beatrice Perez of University College London told Wired. "People think it’s not a big deal."

The study's findings have major implications for data privacy, as the researchers explain in their introduction:

Previous work shows that the content of a message posted on an OSN platform reveals a wealth of information about its author. Through text analysis, it is possible to derive age, gender, and political orientation of individuals (Rao et al. 2010); the general mood of groups (Bollen, Mao, and Pepe 2011) and the mood of individuals (Tang et al. 2012). Image analysis reveals, for example, the place a photo was taken (Hays and Efros 2008), the place of residence of the photographer (Jahanbakhsh, King, and Shoja 2012), or even the relationship status of two individuals (Shoshitaishvili, Kruegel, and Vigna 2015). If we look at mobility data from location-based social networks, the check-in behavior of users can tell us their cultural background (Silva et al. 2014) or identify users uniquely in a crowd (Rossi and Musolesi 2014). Finally, even if an attacker only had access to anonymized datasets, by looking at the structure of the network someone may be able to re-identify users (Narayanan and Shmatikov 2009).

The study's goal was "to determine if the information contained in users' metadata is sufficient to fingerprint an account," and it showed that even rudimentary algorithms identified users with high success rates. During the study, the researchers took metadata such as the date the account was created, its followers, the accounts it follows and the tweets it likes, and ran it through three different machine-learning algorithms. This method, according to RT, could be used to identify an account if a user changes their name or creates multiple accounts, or to tell if a legitimate account has been hacked.
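To make the idea of "fingerprinting" from metadata concrete, here is a minimal sketch in Python. It is not the researchers' code: the article does not name the three algorithms they used, so a simple nearest-neighbor lookup stands in purely for illustration. The four field names, the synthetic profiles, the noise level and the helper functions are all assumptions made up for this example. It shows how an account can be matched against a pool of known profiles, and how accuracy is measured both for the single best guess and for the 10 most likely candidates.

```python
import math
import random

# Hypothetical metadata fields, loosely based on the examples in the article
# (account creation date, followers, followed accounts, liked tweets).
FIELDS = ("account_age_days", "followers", "following", "likes")


def make_profile(rng):
    """Draw a synthetic 'true' metadata profile for one made-up account."""
    return [rng.uniform(0, 3000), rng.uniform(0, 5000),
            rng.uniform(0, 2000), rng.uniform(0, 10000)]


def observe(profile, rng, noise=0.02):
    """Simulate metadata seen on one tweet: the profile plus small drift."""
    return [v * (1 + rng.gauss(0, noise)) for v in profile]


def rank_candidates(sample, profiles):
    """Return user ids sorted from most to least similar to the sample."""
    def distance(uid):
        # Compare on a log scale so large counts do not dominate.
        return math.dist([math.log1p(abs(x)) for x in sample],
                         [math.log1p(abs(x)) for x in profiles[uid]])
    return sorted(profiles, key=distance)


def true_user_ranks(profiles, rng, trials=200):
    """For each trial, the 0-based position of the true user in the ranking."""
    users = list(profiles)
    ranks = []
    for _ in range(trials):
        true_uid = rng.choice(users)
        sample = observe(profiles[true_uid], rng)
        ranks.append(rank_candidates(sample, profiles).index(true_uid))
    return ranks


def top_k_accuracy(ranks, k):
    """Fraction of trials where the true user was among the k best matches."""
    return sum(r < k for r in ranks) / len(ranks)


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
profiles = {uid: make_profile(rng) for uid in range(500)}
ranks = true_user_ranks(profiles, rng)
acc1 = top_k_accuracy(ranks, 1)
acc10 = top_k_accuracy(ranks, 10)
print(f"top-1: {acc1:.2f}, top-10: {acc10:.2f}")
```

Because both figures are computed from the same set of rankings, the top-10 accuracy can never be lower than the top-1 accuracy, which mirrors the jump the study reports when the scope is widened to the 10 most likely candidates. The numbers this toy prints say nothing about real Twitter data.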


While the researchers used Twitter for their data, they warned that "the methods presented in this work are generic and can be applied to a variety of social media platforms."








glenlloyd Four chan Tue, 07/10/2018 - 01:26 Permalink

Yes, that appears to be the general trend at the moment, if it's conservative or patriotic it must be fake.

Funny thing is that people are not buying that what the left constantly says is normal actually is the norm.

The left is perpetually trying to tell people that their opinion or what they say is what should be and in reality they represent such a minute segment of the population.

I think it's just time to dump the twit account, since I don't follow it anymore. And then there's FB, which I sign on to so irregularly and don't use...could be and should be deleted too.

BabaLooey Skateboarder Tue, 07/10/2018 - 05:31 Permalink

Trump...and many others...use Twitter correctly....

It's Trump Unfiltered........damn near daily....

However MOST Twatterers...are simply fucking ASININE in their usage of this....utility....(which it bloody IS one)

Instant...thought for the masses.

Creeps me the FUCK out that one can "re-tweet" others immediate stupidly brilliant....

Out yourself as a lunatic/stupid/piss off people you don't know...alienate people...

How novel....

How completely devoid of intelligence....

Scanderbeg Number 9 Mon, 07/09/2018 - 22:42 Permalink

Indeed, some shitlib from Indiana was arrested after making threats on /pol recently.

Bants are all fine and good but if you're going to threaten bodily harm they can easily find you. Hell, they can even pinpoint your movements with facial recognition software remotely now.

All of these platforms are run by converged companies so if you are generating enough attention you can easily be doxxed at any time and that problem will only get worse.

OverTheHedge Number 9 Tue, 07/10/2018 - 01:12 Permalink

Better let Harry lightning know, before he gets himself into real trouble.

Assume everything you do (especially your porn habit) is available to all. Let your daughters know that any digital photos are freely available to the entire internet, including their father, so act accordingly. Only post bland, mundane, non-reactionary things about kittens and puppies, and you should be safe. "It costs a lot to live this free" - never were truer words spoken on Wayne's World.

Or join the underground, hide out in the woods, and take part in acts of random graffiti.

Umh Mon, 07/09/2018 - 22:20 Permalink

I am so shocked ;) That being said, well, okay. I am not too concerned; I assumed that the PTB had already tapped the line.

One of these i… headless blogger Tue, 07/10/2018 - 00:39 Permalink

navy62802 Mon, 07/09/2018 - 22:26 Permalink

A bunch of basement-dwelling freaks think they can write a computer program that will answer all of their questions about humanity and love and feeling and everything else. That's what has taken over our world. Meanwhile, reality doesn't give a flip and keeps humming along while these freaks watch from the sidelines like they always have.

MuffDiver69 Mon, 07/09/2018 - 22:38 Permalink

If I literally didn’t work for myself I would have no social media footprint. These systems can be and are used by some, and soon all, employers to flag anything....good luck getting a job soon enough if your match doesn’t toe the leftist company line...

roddy6667 Tue, 07/10/2018 - 00:04 Permalink

Back in the Usenet days I could find people using multiple accounts. A lot of people use clever (they think) spellings and phrases a lot. A search of the Usenet archives would often find them in seconds. A guy I knew online had this real world girlfriend that we all warned him about. I showed him where she was a regular in a lesbian chatroom. He married her anyway.

FreeEarCandy Tue, 07/10/2018 - 00:45 Permalink

After drinking 1/2 Scotch, I can review 7 billion people in 2 seconds and tell you with 95% accuracy that not one person on this entire planet gives a crap about being spied on. If I remove the ice cubes from my whiskey glass my accuracy goes up to 99.999%


zvzzt FreeEarCandy Tue, 07/10/2018 - 03:25 Permalink

"...entire planet gives a crap about being spied on".

Not yet. And then suddenly, "outrage".... 

The biggest thing of the past months was that whole Facebook-privacy-bollocks. The fact that they still exist after that is such a massive green light for the data industry to steal/sell/misuse/spy even more than ever. And to be honest, who can blame them...? 

HorseBuggy Tue, 07/10/2018 - 01:08 Permalink

Does anyone read the articles anymore or just the headlines?

"the researchers discovered that their most basic algorithm could correctly identify an individual user in a group of 10,000"


The world has a lot more people than 10,000; let them test it with a million.

Felix da Kat Tue, 07/10/2018 - 01:46 Permalink

The era of social-media anonymity is over. If you voice an opinion, you are immediately and easily doxxed. This has the potential to become a leftist tool of silencing dissent from liberal narratives. Conservatives will be frightened to be in opposition. Going into "Hide-out" mode will be the norm. And it can be monetized as a dangerous political tool, too. Google has had you identified (doxxed) for years. Your IP address is linked with GPS technology. NSA uses the technology to track would-be terrorists. And I'm glad they do. But when dissent is silenced within the rank-and-file citizenry, then their means of recourse is taken away. This opens up or speeds up resorting to other and possibly unsavory methods of dissent. This invasion of privacy will have an unintended consequence. Dissenters will adapt in new and different ways. When this fully evolves with mandatory and inescapable penalties assessed against those who offer views that oppose the liberal narrative, we will have entered a new era in the political landscape. It may become an era of guerilla warfare to get one's point across.

Downtoolong Tue, 07/10/2018 - 07:52 Permalink

If you think this is scary, consider the next step. If Artificial Intelligence can identify you from your behavior, it can also accurately mimic and act as if it was you. The moral and ethical hazards here are unbounded.

Dragon HAwk Tue, 07/10/2018 - 08:42 Permalink

Ok so at least they know how pissed we are, and the list of people to watch has got to be huge, more chaff for the radar.

  at least the Amish are relatively safe.  Me i am just biding my time till the Fema Camp.

shame they can't seem to catch any Banker or Politician crooks,   guess their Metadata isn't that good, you know, wink wink