
Fair moderation is hard, but fair, scalable moderation is harder

Throughout my career, I’ve struggled with the problem of how to enforce policies in a fair, accurate, and scalable way. A new research paper reminds us just how difficult that is.

I'm Alice Hunsberger. Trust & Safety Insider is my weekly rundown on the topics, industry trends and workplace strategies that trust and safety professionals need to know about to do their job.

This week, I'm thinking about content moderation at scale, and what role frontline moderators have to play.

Thanks to everyone who reached out about last week's edition; it seems like it really resonated with many of you. Get in touch if you've got ideas about what I should write about next.

Here we go! — Alice


SPONSORED BY RESOLVER, fighting the next Kidflix before it happens

The recent takedown of Kidflix — one of the largest paedophile platforms in the world — revealed a chilling truth: child exploitation online is no outlier; it’s engineered into the infrastructure of the internet.

With CSAM spreading through advanced tech and everyday platforms, safety can’t be an afterthought. Resolver Trust & Safety works at the frontline, helping platforms design systems that detect and disrupt abuse before it scales.

Because the next Kidflix is already being built, and the time to act is now.

READ MORE

What it would take to prioritise quality in content moderation

Why this matters: Throughout my career, I’ve struggled with the problem of how to enforce policies in a fair, accurate, and scalable way. I’ve tackled this problem as head of T&S at platforms, at a BPO, and currently at an AI moderation solutions provider. The challenge is the same everywhere: how to apply consistent, enforceable rules to content that is constantly shifting in meaning and context. A new research paper makes us think again about the quality vs scale trade-off.

We often talk about moderation as a numbers game: how fast can we respond, how much can we automate, how many decisions can we review per minute. But for those of us who’ve worked across the system, the real question is more fundamental: how do we know if we’re getting it right?

A recent research paper — The Role of Expertise in Effectively Moderating Harmful Social Media Content — looks at this question in the context of real-world harm. The authors investigated social media moderation during times of conflict and genocide, focusing on posts targeting Tigrayans during the 2020–2022 Tigray war. They came to an interesting conclusion, finding:

“open discussions and deliberation meetings where moderators can exchange contextual information and resolve disagreements to be more effective ways of ensuring that harmful content is appropriately flagged.”

They did this by giving moderators space to think more deeply, take context into account, exercise their judgment, and then spend as much time as they needed to make the decisions themselves:

"We decided on 55 posts per week to restrict the participants’ exposure to harmful content to a maximum of one hour per day, based on feedback from participants and the psychological impacts this content can cause. After each round of annotation, our expert annotators spent a minimum of 60 minutes discussing their disagreements and arrived at a final agreed-upon label for each post after deliberation."

What was notable to me was that, even after this lengthy deliberation step, there was still significant disagreement: as high as 71% and, after several rounds, a still very high 33%. This shows just how difficult moderation decisions are to make.

Get access to the rest of this edition of EiM and 200+ others by becoming a paying member