AI System Detects Posts By Foreign 'Trolls' On Facebook and Twitter (theguardian.com) 62
An anonymous reader quotes a report from The Guardian: Foreign manipulation campaigns on social media can be spotted by looking at clues in the timing and length of posts and the URLs they contain, researchers have found. Now researchers say they have developed an automated machine learning system -- a type of artificial intelligence -- that can spot such posts, based on their content. Writing in the journal Science Advances, the team report how they carried out their work using posts from four known social media campaigns that targeted the U.S., attributed to China, Russia and Venezuela.
After training the system on a subset of the data, the team explored five different questions. These included whether the machine learning system could tell apart posts from trolls and those linked to normal activity, and whether feeding the system with troll posts from one month would allow it to spot posts made during the following month by new troll accounts. The results show that the approach worked well, with the posts flagged by the system generally coming from trolls. However, not all troll posts were identified by the system. The team also found differences in the system's performance depending on the country behind the campaign, with Chinese activity easier to spot than Russian activity. "In terms of the Venezuelan campaigns, our performance is near-perfect; close to 99% accurate," said one of the researchers. "In terms of the Chinese one, our performance is around 90%, 91%. The Russian was the most complicated and most sophisticated campaign: our performance was around 85%."
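The summary doesn't spell out the researchers' actual pipeline, so purely as a hedged illustration of the general approach - train a content-based classifier on posts labelled in one period, then test it on posts from a later one - here is a minimal sketch. The library choice (scikit-learn), the features, and the toy data are all assumptions, not the paper's method.

```python
# Minimal sketch of a content-based troll-post classifier, in the spirit of the
# approach described above. Features, model, and data are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline

# Hypothetical labelled data: posts from a known campaign (1) vs. ordinary
# activity (0), split by month so the model is evaluated on later, unseen posts.
train_posts = ["Check out this shocking story http://example.com", "Had a great lunch with friends today"]
train_labels = [1, 0]
test_posts = ["Another shocking story you won't believe http://example.com", "Off to the gym, then groceries"]
test_labels = [1, 0]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word and bigram content features
    LogisticRegression(max_iter=1000),     # simple linear classifier
)
model.fit(train_posts, train_labels)
print(classification_report(test_labels, model.predict(test_posts)))
```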
Is it fair to call it an AI? No. (Score:2)
yawn.
Re: (Score:2)
Slashdot can't figure out how to filter ASCII swastikas or GNAA spam.
Or identify duplicate posts before they're posted.
Let's face it... (Score:1)
There are people who mimic the bots....
Re: (Score:2)
Re: (Score:1)
AI (statistical classification) doesn't know anything; it only knows what it is fed. So if you feed it conservative media, it will flag that as troll content. If you fed it CNN articles, Facebook would implode.
Re: (Score:1)
It's easy (Score:4, Funny)
Re:It's easy (Score:5, Interesting)
It's easy for AI to detect posts by non-Americans. The spelling and grammar are correct.
It depends. Typically, American spelling/grammar mistakes involve homophone mistakes, absence of apostrophes in contractions, run-on and run-together sentences, and the absence of capitalization or periods.
Individuals for whom Russian is their first language tend to make different grammatical mistakes. I commonly see issues with articles and other small adjectives, as well as present-tense verb conjugations used in place of the present progressive - "I go to store" vs. "I am going to the store", or "I have date tonight" vs. "I have a date tonight". Caps and punctuation are almost always perfect, but those other patterns tend to be giveaways for native Russian speakers.
Mandarin/Cantonese has its own dead giveaway. English has an actual adjective order: Quantity, Quality or opinion, Size, Age, Shape, Color, Proper adjective, Purpose. It's why my description of "green, fast, decrepit, small, six Maseratis" sounds far weirder than "six small, fast, decrepit green Maseratis". Mandarin/Cantonese speakers who use multiple adjectives tend to have issues with ordering them properly when speaking in English.
My lady friend, who speaks Spanish and only recently started learning English, tends to use "so" where "very" would be expected - she'll say "My long day at work made me so tired", instead of "My long day at work made me very tired".
Different cultures seem to have issues with different aspects of English, and while I do have concerns about Twitter using AI against 'Foreign Trolls' (what about the foreign trolls whose views they like?), I would submit that it's not unreasonable to believe that an AI could be trained to distinguish between a foreign English speaker and an American who couldn't pass that 4th grade English test.
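To make the adjective-order heuristic above concrete, here is a toy sketch. The tiny category table is a hand-rolled assumption covering only the example words; a real detector would need a proper lexicon and part-of-speech tagging.

```python
# Toy check of the English adjective-order heuristic (opinion < size < age <
# shape < colour ...). The category table is a hand-made assumption covering
# only the example adjectives from the comment above.
ORDER = {"opinion": 1, "size": 2, "age": 3, "shape": 4, "colour": 5}
CATEGORY = {"fast": "opinion", "small": "size", "decrepit": "age", "green": "colour"}

def order_violations(adjectives):
    """Count adjacent adjective pairs that break the expected ordering."""
    ranks = [ORDER[CATEGORY[a]] for a in adjectives if a in CATEGORY]
    return sum(1 for a, b in zip(ranks, ranks[1:]) if a > b)

print(order_violations(["green", "fast", "decrepit", "small"]))  # 2 violations (the "weird" order)
print(order_violations(["small", "fast", "decrepit", "green"]))  # 1 violation (much closer to natural)
```

A higher violation count would just be one more weak signal to feed into a classifier, not proof of anything on its own.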
Re: (Score:2)
I would submit that it's not unreasonable to believe that an AI could be trained to distinguish between a foreign English speaker and an American who couldn't pass that 4th grade English test.
Or could be trained to reliably differentiate between a foreign English speaker and an immigrant (14% of US population).
Re: (Score:2)
Re: (Score:2)
My lady friend, who speaks Spanish and only recently started learning English, tends to use "so" where "very" would be expected - she'll say "My long day at work made me so tired", instead of "My long day at work made me very tired".
I'm a native English speaker and both of those sound natural to me.
Re: (Score:3)
Most of these are really good examples. I do want to take issue with one of them.
My lady friend, who speaks Spanish and only recently started learning English, tends to use "so" where "very" would be expected - she'll say "My long day at work made me so tired", instead of "My long day at work made me very tired".
I'm a native English speaker and both of those sound natural to me.
I was having a bit of trouble coming up with a particularly good example - "so" and "very", in most cases, are indeed interchangeable and grammatically correct. It was more of a 'grammatically correct, but nobody says it that way' vibe.
The way I described it to her is that statements which use "so" are most commonly tied to a "that" statement. While "my long day at work made me so tired" is correct, I'd expect "very" to be used unless it was followed up. For example, "my long day at work made me so tired th
Re: (Score:2)
1) If I am sitting at a desk with a regular keyboard and computer.
2) If I am using a phone or tablet.
3) If I am sitting in front of the TV using the laptop and typing one-handed.
Re: (Score:1)
Exactly - if you see someone talking about their "donkey pet" instead of "pet donkey", then you know they are not a native speaker. Both are legal in English, but one isn't common usage.
I think you can combine this with the I.P. address and also identify VPNs.
And you also need to combine this with human beings, including legwork on the ground.
If you identify a possible antagonistic social media agent posting from a known I.P. address, then you need to send federal security agents to check it out.
And it woul
Re: (Score:2)
Exactly - if you see someone talking about their "donkey pet" instead of "pet donkey", then you know they are not a native speaker.
Actually, both are perfectly good English although one is certainly more common.
"My pet donkey" informs you that I have a donkey that is a pet.
"My donkey pet" tells you that I have a pet, which happens to be a donkey. (In this case the fact of the pet being a donkey, rather than any other type of animal, is stressed).
And you'd be astonished at some of the strings that alleged native English speakers can produce. One of my favourites is the programmer's code comment "Horse string length into correctitude".
An
Re: (Score:2)
Buffalo buffalo buffalo Buffalo buffalo Buffalo buffalo buffalo.
"Your comment violated the "postercomment" compression filter. Try less whitespace and/or less repetition."
Perfectly valid and meaningful English sentence that's blocked, but ASCII swastikas aren't...
Re: (Score:1)
I've read close to 10,000 books.
I've discussed various subjects on usenet and then the internet, youtube forums, slashdot, etc.
The first and *ONLY* time I've ever seen "donkey pet" instead of "pet donkey" was coming out of a reverse Chinese translation while writing a localization file for Chinese in Google.
As I said, it's legal. On the other hand, it would be damned suspicious to see a purported "native" speaker using donkey pet.
In fact, when you *specify* it in a Google search there isn't a single success
Re: (Score:3, Insightful)
I guess when even stupid AI can detect conservative troll posts that easily they deserve to be censored. You're focusing on detection and censorship. You should probably work on the troll part first.
Re: (Score:1)
All the so-called AI will do is look for matching comments within a range of patterns: comments that are alike - many of course being exactly the same - posted under other names in other forums, looking for that pattern repeated by those names, and when it is, voila, troll. The "AI" part only comes in for posts that aren't identical - a few words changed or the sentences rearranged. Just pattern matching.
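For what it's worth, that kind of pattern matching - flagging near-copies of the same comment posted under different names - can be sketched in a few lines. The shingle size, threshold, and sample posts here are arbitrary illustrative choices, not anyone's production filter.

```python
# Rough sketch of near-duplicate detection across accounts: compare comments
# by the overlap of their character 5-grams (Jaccard similarity).
def shingles(text, n=5):
    """Set of overlapping character n-grams, lowercased and whitespace-normalised."""
    t = " ".join(text.lower().split())
    return {t[i:i + n] for i in range(max(len(t) - n + 1, 1))}

def jaccard(a, b):
    return len(a & b) / len(a | b)

posts = {
    "acct_1": "Candidate X is a total disaster for this country, wake up people!",
    "acct_2": "Candidate X is a disaster for this country, wake up folks!",
    "acct_3": "Anyone tried the new ramen place downtown?",
}

SUSPICIOUS = 0.5  # arbitrary similarity threshold
names = sorted(posts)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        sim = jaccard(shingles(posts[a]), shingles(posts[b]))
        if sim >= SUSPICIOUS:
            print(f"{a} and {b} look like near-duplicates (similarity {sim:.2f})")
```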
Re: (Score:3)
It sounds like it's just Bayesian filtering, not really AI. Similar to spam filtering, just weighting a few factors like these (a rough scoring sketch follows the list):
- Does the account only post during Moscow office hours?
- Does metadata on the photos say they were taken in St. Petersburg?
- Is the IP address Russian?
- Does it re-post (with slight changes) stuff from other foreign troll accounts?
- Does it prefer graphic memes (which are harder to scan than text)?
- Was the profile picture stolen from somewhere?
- Does it always post from Firefox on desktop?
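As a rough illustration of weighting factors like these - not the actual filter any platform runs - here is a toy scoring sketch. The signal names mirror the list above, while the weights and the threshold are invented numbers.

```python
# Toy suspicion score: sum hand-picked weights for whichever boolean signals
# fired for an account. Weights and threshold are made up for illustration.
SIGNAL_WEIGHTS = {
    "posts_only_during_moscow_office_hours": 2.0,
    "photo_metadata_says_st_petersburg": 1.5,
    "russian_ip_address": 1.0,
    "reposts_known_troll_content_with_slight_changes": 3.0,
    "prefers_graphic_memes": 0.5,
    "stolen_profile_picture": 2.5,
    "always_posts_from_firefox_on_desktop": 0.5,
}
FLAG_THRESHOLD = 4.0  # arbitrary cutoff

def suspicion_score(signals):
    """Sum the weights of the signals that are true for this account."""
    return sum(weight for name, weight in SIGNAL_WEIGHTS.items() if signals.get(name))

account = {"russian_ip_address": True, "reposts_known_troll_content_with_slight_changes": True}
score = suspicion_score(account)
print(f"score={score:.1f}, flagged={score >= FLAG_THRESHOLD}")
```

A real Bayesian filter would learn those weights from labelled examples rather than hard-coding them.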
Can't have it both ways (Score:2)
I am skeptical AI could reliably detect something like that. It is more likely that AI will be used as a pretext to ratchet up censorship of conservative opinions...
You can't have it both ways. Either the ML algorithms can reliably understand content and block it or they cannot. If they cannot reliably identify trolls then they probably can't also identify conservative opinions.
My bigger concern though is that these algorithms do work. If that is the case then we have just created the best tool ever made for mass censorship. I doubt that this will end up being used responsibly.
Re: (Score:2)
I am skeptical AI could reliably detect something like that. It is more likely that AI will be used as a pretext to ratchet up censorship of conservative opinions...
You can't have it both ways. Either the ML algorithms can reliably understand content and block it or they cannot. If they cannot reliably identify trolls then they probably can't also identify conservative opinions.
I did not suggest that censorship would be automatic or AI-driven. Humans can understand context fairly well, do the censorship and then blame the result on rogue AI.
Re: (Score:2)
This is exactly the same criticism levied against the use of AI in law enforcement. In that case, the criticism is typically raised by liberals concerned with bias based on race or social status.
It is true that our best AIs are typically neural net / deep learning, which are largely controlled by their training data. I believe an unbiased or low-bias AI is possible, but we would need to control the quality of the training data in a systematic way.
As far as I know, no systematic approaches have been developed f
Re: (Score:2)
I am skeptical AI could reliably detect something like that. It is more likely that AI will be used as a pretext to ratchet up censorship of conservative opinions without them having to openly admit to it. This way when they are called out on politically-motivated censorship they will be able to blame AI.
I also am skeptical, and fully expect that it will consider me to be AI based on whatever faulty logic they are using.
Let's be honest here. By now we should all be familiar with just how badly most software is written, and only a fool would think this could be accurate.
Re: (Score:3)
It is more likely that AI will be used as a pretext to ratchet up censorship of conservative opinions without them having to openly admit to it.
Of course, part of the problem definition is 'remove false information' and most 'conservative opinions' are based on false (or non-existent) information.
Pray that Covid away!
Covid is just a libtard hoax!
BLM is a terrorist organization!
Sorry, none of these should pass a basic test, although they do seem to all be mainstream-Conservative opinions.
How about... (Score:2)
But can it detect posts by homegrown morons?
"Social networks" are faulty in principle, because they break through locality of communication, and remove social constraints on any one person's speech, while giving arbitrary individuals unlimited reach. This is a few levels above "freedom of speech" - this is amplification, and it benefits loud-mouth morons disproportionately. There is no reason for an arbitrary person to be explicitly enabled to reach thousands (or millions) of others. Everything else is just
Re: Arbitrary Person (Score:2)
And by “arbitrary person” you presumably mean someone of whom you do not approve.
Detect this (Score:1)
Chicken and egg problem (Score:2)
How do they even know which messages are state campaigns rather than real people in the initial data they are using to train the AI? Since that can be difficult to determine in the first place, the whole thing is suspect. It's a sort of chicken-and-egg problem.
Re: (Score:2)
You may not know from an individual message, but if you take the, dare I use the term, holistic approach - tracking the account's posting history and "likes" - you may gain valuable insights.
For example, an account that yesterday cried about the British NHS being underfunded, while attacking Obamacare as socialist today, is in all likelihood not concerned about either. And when it then attacks British help to Ukraine,
Obvious (Score:4, Interesting)
The reason the AI performed better on one than the other is probably entirely due to the language. The Chinese bot posts were almost certainly robotically created, as nearly all other Chinese "hacking" nonsense tends to be as little-effort as possible and contextually vapid (basically only ever defending Chinese interests, and offering zero insight). Russian bots, however, actually pretend to engage with English-speaking posters, often parrot articles from the crappiest of misinformation sources like extreme right-wing sites, and are more about sowing divisive chaos (nearly all of the 5G-causes-cancer, anti-mask, and anti-vaxx type of content originates on one of these crappy right-wing sites, and is then massaged into social media posts via bots).
It's almost a certainty that you're dealing with a bot on Twitter if the language they use is repetitive. Just follow @AOC on Twitter to see the worst bots and trolls, and people who don't know they're dealing with trolls, in the replies. Similar things happen with other celebrities and politicians, but AOC's replies are a magnet for one specific kind of right-wing trollbot and some left-wing ones too. The pumpkin-in-chief, on the other hand, just has a lot of people insulting the Twitter handle, almost as if mentioning his name is a synonym for idiot. I'm certain there's a lot of bot activity there too, but it's probably harder to tell it apart from people genuinely just insulting him.
Re: (Score:2)
Re: (Score:1)
Re: (Score:1)
Re: (Score:1)
I love it when the "Trump is a Nazi" bots fight with the "Trump is a tool of the Jews" bots.
Won't work. (Score:1)
Great. So the bots will find a way around it, with ease, while the filter gets stricter and stricter, so more and more real people get censored just for unfavorable opinions.
We know this stuff already happens; it's not new. What's new is that newspapers seem to support it. Maybe because they consider Facebook and Twitter competition? Then again, it's the Guardian - what else to expect of them? Might as well dismiss it as fake news just because it's the Guardian, since they are anything but unopinionated.
Evolve and adapt (Score:2)
They will also add in random variants (mutations, if you like) and see which of those get caught and which get through. As long as AIs require training on what has happened in the past, they will always be one step behind.
Re: (Score:2)
It's as simple to get around as people simply "getting" what you're saying in a sentence that has only one tiny typo (in the form of swapping in words with the same pronunciation).
Ahh who can forget the joys of mentioning pr0n.
Everyone looks like a troll (Score:1)
Just another censorship tool. (Score:1)
Cat and mouse game (Score:1)
They'll just use AI and other techniques to work around the detectors. Patterns and counter patterns doing the tango.
Conclusion 1: So it excludes local trolls? (Score:2)
Conclusion 2: It is racist.
Conclusion 3: There is no "it". Just people, and their mindset, condensed into a universal function.
Let's just be clear (Score:2)
Felix Dzerzhinsky would have been TICKLED to have such a tool.
Insert neocon cyber BS (Score:1)
Sad, watching Slashdot regurgitate this kind of neocon cyber BS on a once-proud tech forum.
Re: (Score:2)
muddy the water Ivan, muddy the water
So the TrollTrace.com is finally here? (Score:1)
So the TrollTrace.com is finally here?
They could also (Score:1)
Post the i.p. address the person is posting from and if that's a VPN.
Look for peculiar grammar orders and vocabulary choices.
Determine if multiple people are posting from the same i.p. address.
If any of those are identified, then give the rest extra attention.
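As a toy illustration of the shared-IP item in the list above - again, not any platform's real pipeline - something like this would group accounts by login IP and flag shared addresses or known VPN exits. The data and the VPN list are invented placeholders.

```python
# Group accounts by the IP address they post from and flag IPs that are shared
# by multiple accounts or that appear on a (placeholder) list of VPN exits.
from collections import defaultdict

KNOWN_VPN_EXITS = {"203.0.113.7"}  # placeholder set, not a real VPN database

logins = [
    ("acct_a", "198.51.100.23"),
    ("acct_b", "198.51.100.23"),
    ("acct_c", "203.0.113.7"),
]

accounts_by_ip = defaultdict(set)
for account, ip in logins:
    accounts_by_ip[ip].add(account)

for ip, accounts in accounts_by_ip.items():
    if len(accounts) > 1:
        print(f"{ip}: multiple accounts {sorted(accounts)} -> give extra attention")
    if ip in KNOWN_VPN_EXITS:
        print(f"{ip}: known VPN exit used by {sorted(accounts)} -> give extra attention")
```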
But there is the basic problem - most of these services have no humans in the loop any more. It's just automated systems.
You need humans to identify the troll posts before you can train A.I. on them.
You can't give something extra attention if there isn
'Trolls' (Score:2)
Damn, this happened right under my nose and I should have noticed it a long time ago, but I didn't really notice it until today.
"Troll" has become a synonym for sock puppet!
What a post (Score:1)