
Facebook’s content moderation ‘happens too late,’ says new Northeastern research

Research from Northeastern University finds a “mismatch” between the speed of Facebook’s content moderation and its recommendation algorithm. Photo by Alyssa Stone/Northeastern University.

Whether from the White House or a neighbor in your Facebook community group, a request to remove a Facebook post can prompt accusations of censorship or misinformation, or even become a meme. 

But does removing a post really matter? 

New research from Northeastern University finds that Facebook posts removed for violating community standards or other reasons had already reached at least three-quarters of their predicted audience by the time they were taken down.

“On Facebook, content moderation doesn’t have much impact on user experience because it happens too late,” says Laura Edelson, assistant professor of computer sciences at Northeastern.

All major social media platforms use content moderation as a tool to prevent harmful or illegal content from spreading on their sites.

Such companies have been cagey, however, about what exactly that moderation entails and how extensively it is practiced, critics say.

Plus, it’s hard to know the effects of that content moderation — after all, how can you measure what didn’t happen?

That’s where Edelson comes in, proposing a new metric called “prevented dissemination.” The metric uses machine learning models, trained on millions of posts, to predict how much further a given post would have spread had it stayed up.

“We wanted to understand what the impact of content moderation was and, in order to do this, the question we’re really asking is, if takedown didn’t happen, what would have happened?” Edelson explains. 
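
In rough terms, answering that counterfactual question comes down to training a model on posts that were never removed, predicting a removed post’s eventual engagement from what it had accrued by the time of takedown, and treating the gap as the dissemination that moderation prevented. The sketch below illustrates that idea on synthetic data; the features, model choice and variable names are assumptions for exposition, not the study’s actual implementation.

```python
# Illustrative sketch only: the features, model and numbers here are
# assumptions for exposition, not the study's actual code or data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic "surviving" posts: engagement accrued early on and a proxy for the
# page's audience size are the features; eventual 48-hour engagement is the target.
n = 1000
early_engagement = rng.gamma(2.0, 50.0, n)
page_followers = rng.gamma(2.0, 5_000.0, n)
total_engagement = early_engagement * 1.3 + 0.001 * page_followers + rng.normal(0, 5, n)

X = np.column_stack([early_engagement, page_followers])
model = GradientBoostingRegressor().fit(X, total_engagement)

def prevented_share(engagement_at_takedown, followers):
    """Fraction of a post's predicted eventual engagement that a takedown prevents."""
    predicted_total = model.predict([[engagement_at_takedown, followers]])[0]
    prevented = max(predicted_total - engagement_at_takedown, 0.0)
    return prevented / predicted_total if predicted_total > 0 else 0.0

# A post removed after it has already collected most of its predicted engagement
# yields a small prevented share, which is the pattern the study reports.
print(round(prevented_share(engagement_at_takedown=120.0, followers=8_000.0), 2))
```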

To develop and then test this metric, Edelson and colleagues examined over 2.6 million Facebook posts from 17,504 unique news and entertainment pages with content in American English, Ukrainian and Russian. 

The researchers monitored the posts regularly (every six hours for English pages and every two hours for the less numerous Ukrainian and Russian pages) between June 17, 2023, and Aug. 1, 2023, to see if and when posts were removed and how quickly they accrued engagements such as likes, shares and comments.

The researchers found that a small fraction of posts accounted for the majority of user engagements.

In fact, the top 1% most-engaged posts accounted for 58% of user engagements on American English content, 45% on Ukrainian content and 57% on Russian content.

Those engagements also happen quickly. 

The researchers found that 83.5% of a post’s “total” engagement is accrued during the first 48 hours the post is up, and that posts took a median of three hours to reach their first 50% of engagements.
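
Because engagement was only observed at fixed intervals (every six hours for English pages), a figure like the three-hour median has to be estimated between snapshots. Below is a minimal sketch of one way to do that with linear interpolation; it is purely illustrative and not the paper’s code.

```python
# Illustrative sketch: estimate when a post crossed 50% of its 48-hour
# engagement, given snapshots taken every six hours. Not the study's code.
import numpy as np

def time_to_half_engagement(snapshot_hours, snapshot_engagement):
    """Linearly interpolate the hour at which cumulative engagement
    first reaches half of its 48-hour total."""
    hours = np.asarray(snapshot_hours, dtype=float)
    eng = np.asarray(snapshot_engagement, dtype=float)
    half = 0.5 * eng[-1]
    # Index of the first snapshot at or above the halfway mark.
    i = int(np.argmax(eng >= half))
    if i == 0:
        return hours[0]
    # Interpolate between the snapshot before and the one that crossed the mark.
    t0, t1, e0, e1 = hours[i - 1], hours[i], eng[i - 1], eng[i]
    return t0 + (half - e0) / (e1 - e0) * (t1 - t0)

# Example: snapshots every six hours over 48 hours for one post.
hours = [0, 6, 12, 18, 24, 30, 36, 42, 48]
engagement = [0, 400, 620, 720, 780, 820, 850, 870, 880]
print(round(time_to_half_engagement(hours, engagement), 1))  # ~7.1 hours
```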

As for the posts that were removed, the researchers found that they made up only a small share of the sample.

Of the 2.6 million posts analyzed, the researchers found that 12,864 posts in English were removed (0.7% of English-language posts), along with 1,071 in Ukrainian (0.2%) and 2,223 in Russian (0.5%), according to the study.

And, although the research did not determine exactly why each post was removed, Edelson notes that the majority of the removed posts were “a variety of spam.”

“This is what most content moderation on platforms are focused on — things that are clickbait, things that are spam and things that are fraud,” Edelson says. 

Finally, the researchers found that removing posts prevented only 24% to 30% of the posts’ predicted engagement. 

“What this tells us is that if content moderation is going to have an impact on user experience — which is to say, if a platform is going to use content moderation as a strategy to not show users bad stuff — that content moderation needs to happen at the same speed as the content algorithm recommends things to people,” Edelson says. “In this case, Facebook has a fast feed algorithm and slow content moderation.

“It’s not necessarily a problem that content moderation is slow; it’s not necessarily a problem when a feed algorithm is fast,” Edelson says. “The problem is the mismatch between the two.”