4 min read

📌 Redesigning how users report, Koo's corpus of offensive language and meeting 'The Decider'

The week in content moderation - edition #140

Hello and welcome to Everything in Moderation, the weekly digest about all things content moderation. It's written by me, Ben Whitelaw.

New subscribers from Tripadvisor, Boston University and Ofcom, a special hello to you.

This week's edition is jam-packed and has the strong sense of people — whether working in product or policy — getting things out of the door before this crazy year draws to a close.

Next week's EiM (17th December) will be the last of 2021. I hope to have a few surprises to round off the year so stay tuned for that edition — BW

📜 Policies - emerging speech regulation and legislation

Let's begin with some good news: A set of widely-endorsed recommendations created in 2018 to help platforms more fairly enforce their community guidelines has been updated following a two-year consultation. The updated Santa Clara Principles continue the transparency and accountability efforts of the original set and, now co-authored by 14 organisations, contain both foundational principles and operational expectations of platforms that endorse the recommendations. Having seen the real-world effect of the first iteration, I'm very pleased that these updated Principles have been published.

In not-at-all-surprising news, Twitter's new private information policy (EiM #139) has led to a "significant amount" of malicious reports, according to The Washington Post, and caused enforcement teams to make "several errors". As Mike Masnick at Techdirt points out, there seems to have been no plan to counter the inevitable abuse of the moderation process. Twitter has said it will conduct an internal review to ensure the policy is "used as intended".

💡 Products - the features and functionality shaping speech

Reporting content, for the most part, has barely changed online in almost two decades. You hit report and then pick the policy that you think might have been violated. Not for much longer, at least on Twitter. A new blog post outlines how the platform is "refocusing on the experience of the person reporting the Tweet" by using a "symptoms first approach" and first asking what happened. It is initially being trialled with a small group of US users, but I expect it will have a profound effect and that other platforms will follow suit. (Thanks Ian for sharing)

Talking of reporting content, the UK Safer Internet Centre is conducting a survey about its Report Harmful Content service (I'll be honest, it's new to me), which provides advice and supports users in reporting offending content on platforms.

Automated moderation has long been touted by Facebook as the answer to online harm (I often recall this 2016 piece) and, while it's still years away from delivering on its promise, Meta's AI research group believes it has made a breakthrough, according to a newly published paper. Engadget reports that the new technology — known as "few-shot learner" (FSL) — leverages something called "entailment" in its model, turning classification labels into natural language sentences so that fewer examples of malicious content are needed for systems to respond to it. It's a bit over my head but, if you're interested, read the paper.
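If you want a flavour of what "entailment" means here, the idea is easier to see in code than in prose. The sketch below is purely illustrative: it uses a publicly available NLI model through Hugging Face's zero-shot classification pipeline rather than Meta's FSL system, and the post text and label set are invented for the example.

```python
# Illustrative sketch only: a public NLI (entailment) model via Hugging Face's
# zero-shot classification pipeline, not Meta's few-shot learner (FSL).
from transformers import pipeline

# An entailment model scores how strongly a premise (the post) implies a
# hypothesis (a policy label rephrased as a natural-language sentence).
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

post = "Buy followers now!!! Click this link before the offer expires."

# Because labels are expressed as sentences ("This text is spam."),
# new categories need few or no labelled training examples.
labels = ["spam", "harassment", "misinformation", "benign"]
result = classifier(post,
                    candidate_labels=labels,
                    hypothesis_template="This text is {}.")

for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.2f}")
```

The appeal of this framing is that adding a new policy category is mostly a matter of writing a new hypothesis sentence, rather than collecting thousands of labelled examples.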

Dozens of "reportation bots" created by TikTok users and shared via Github are cropping up as users take moderation into their own hands, according to this interesting LA Times piece. In it, reporter Brian Contreras speaks to a 14-year-old from Denmark who created a mass reporting script to "eliminate those who spread false information or … made fun of others" as well as programmers from Saudi Arabia, Hungary and Kurdistan who have targeted bullies and paedophiles. Commendable stuff although I'm sure others are using the same tools for nefarious means. My story of the week.

Are you concerned about the spread of harmful content on social media? Do you have an opinion about who should be responsible for deciding what content can or cannot stay up? Jenny Fan and Sophie Zhang, whose work on digital juries I've featured in the past (EiM #72), are looking for people (particularly users of Discord, Stack Exchange, Facebook groups, Reddit, and Twitter) to take a short 15-minute survey as part of a new piece of research. Participants are eligible for a raffle to win $50 gift cards.

💬 Platforms - efforts to enforce company guidelines

It wasn't long ago that Facebook was having to confirm that it had a secret moderation programme for its high-profile users (EiM #128). Now, it has been revealed that Twitter has the same thing, albeit with a cuter name. "Project Guardian", according to Bloomberg, was created two years ago (when Jack Dorsey gave the platform a 'C' for combatting abuse and around the time a host of black footballers in the UK were targeted by racist abuse) and its list is made up of celebrities and viral sensations targeted by harassment.

In other news from the blue bird app this week, Twitter announced a moderation research consortium, launching in early 2022, to "expand transparency about our content moderation decisions". Membership of the consortium will first be offered to the 200 researchers that already have access to Twitter's hashed operational datasets, with other global experts being invited to study governance issues on the platform soon.

India's Twitter clone, Koo, announced this week that it will partner with the Central Institute of Indian Languages (CIIL) to produce a corpus of expressions "that are considered offensive or sensitive across 22 languages". The list of words, phrases, abbreviations and acronyms will be used across its products to make them safer, according to reports.

👥 People - folks changing the future of moderation

I have a huge amount of respect for the people at platforms tasked with making calls on offending content. I don't envy the power they hold but I do really like hearing from them about how they dealt with that responsibility.

One such person is Nicole Wong, who was Google's deputy general counsel between 2004 and 2011 and the person in charge of making decisions when users and governments raised complaints. She was known as "The Decider" at the company and, in this new podcast interview with Lawfare, explains what that was like (including having calls with colleagues around the world from 5am till way beyond my bedtime).

Wong went on to be Twitter’s legal director of products and the deputy chief technology officer of the United States under the Obama administration so she knows her stuff. Worth a listen.

🐦 Tweets of note

  • "This is a major stand against the realities of today's online content." - Spectrum Labs' Justin Davis reacts to recent news that Lush Cosmetics have quit four major social media platforms because they feel like "places no one should be encouraged to go.”
  • "Being recognized for my research on tackling violence and abuse against women online and many years of advocacy to improve online spaces for women really is an incredible feeling!" - Azmina Dhrodia, safety policy lead at Bumble, on a much-deserved accolade.
  • "Rough math suggests it would take 8000 years to just process all of this year's cases!" - Hashtag inventor Chris Messina is sceptical about the Oversight Board's mammoth case backlog mentioned in its Q3 transparency report.