
Content policy is basically astrology? (part one)

Moderating with large language AI models could open up new ways of thinking about content policy, moderation, and even the kinds of platforms that are possible. But there may be downsides too.

I'm Alice Hunsberger. Trust & Safety Insider is my weekly rundown on the topics, industry trends and workplace strategies that trust and safety professionals need to know about to do their job. This week, I'm thinking about:

  • How moderating with LLMs is going to change everything
  • Resources for pivoting to T&S from another industry

Get in touch if you'd like your questions answered or just want to share your feedback. I noticed the reply-to email on this newsletter last week didn't work, but it should be fixed now. Sorry about that!

Here we go! — Alice

Humans & LLMs: who does what?

Why this matters: Moderating with large language AI models is going to change everything, says former OpenAI head of T&S, Dave Willner. It will open up new ways of thinking about policy, moderation, and even the kinds of platforms that are possible. But there may be downsides too.

I promise I'm not going to write about AI and the future of T&S every week, but Dave Willner's recent talk at the Berkman Klein Center for Internet & Society — "Moderating AI and moderating with AI" — made it impossible to not bring up again.

In it, the former OpenAI, Airbnb and Facebook veteran suggests that large language AI models (LLMs) should replace the bulk of human moderators. "This is a really big deal if you accept the case I made to you," he says.

Willner's case boils down to this: that problems with bias and mistakes from LLMs are "static, engineering-shaped problems" that can be solved, unlike the "roiling mass of chaos" that is human moderation at scale. He predicts mass job loss for frontline moderators, but also an increase in higher-level jobs 1) overseeing and running QA for AI (similar to what I talked about last week) and 2) writing incredibly detailed policy documents for the LLMs to run from. In short:

"This does not just mean we will lift and drop AI in the place of human frontline moderators. It will change of the kinds of systems that are viable to have. It opens up new possibilities of moderation."

Using LLMs for moderation instead of humans will unlock fundamentally different ways of thinking about policy, he says:

"Right now, a lot of content policy is basically astrology about how content moderators will react to the words that you wrote."

As he explained in a recent Tech Policy Press piece with former Meta staffer Samidh Chakrabarti, we'll also be able to test different versions of policies with LLMs — thus pre-empting unintended consequences of guideline changes — and even create new kinds of online platforms that we haven't even dreamed up yet.

But, where are the humans?

One interesting thing that Willner leaves out of his main talk but addresses in the Q&A is the following question: do we want humans to be involved in this process and, if so, in what capacity?

Regulation will guide the answer to this question to some degree. Gartner recently predicted that, over the next few years, the European Commission will mandate that customers have "the right to speak to a human" during customer service interactions. And it already exists today in the language of the Digital Services Act, which states that internal complaint-handling systems "are subject to human review where automated means are used."

When asked about how regulation will shape human involvement, Willner replies:

"I think the rise of gen AI problematises a bunch of the European moves here, because they assume that a human answer is going to be better, more fulfilling, more correct, and my basic thesis is that's wrong."

So it's clear he sees an ever-smaller role being played by humans. Except in one area.

Willner predicts that, as the industry gets better at the classification job of moderation, the mechanics of enforcement will matter less and the values that enforcement is based on will matter more. He doesn't say it explicitly but it's clear who will decide those values: humans, of course.

Better for everyone?

With humans still involved — albeit at a higher level — Willner says that we will still have to account for bias in LLM systems, but predicts that overall this system will be better for marginalised people:

"Part of the perverse shadow of the request for more cultural context being injected into moderation is that it's essentially a call for the enlistment of people who are victimised by speech in the controlling of that speech to begin with, which is perverse when we think about it that way."

Here, I couldn't help but think of another talk in the same Berkman Klein series in which Nadah Feteih and Anika Collier Navaroli discuss the experience of being a marginalised tech worker and the concept of "compelled identity labor", where people are asked to work on a problem because of their identity. We've all seen this many times before; for example, a Black employee being asked to advise their company on how to deal with race issues.

In the world that Willner outlines, where values are front and centre and every policy decision must be meticulously A/B tested with LLMs before going live, it will be vitally important to have a diverse set of people in the room when designing policy. For these professionals — who are responsible for a model's values and know that their decisions will be instantly scaled across a platform — the work will be no less demanding than frontline moderation is today.

Read part two on this topic and the trade-offs between LLMs and humans in next week's T&S Insider.

You ask, I answer

Send me your questions — or things you need help to think through — and I'll answer them in an upcoming edition of T&S Insider, only with Everything in Moderation*

Get in touch

Job hunt

One of the things that I'm asked frequently is how to pivot to a career in Trust & Safety from another industry. A resource that many don't know about is the Trust & Safety Professional Association's YouTube page. Here are a few videos on switching to a career in T&S that might be useful:

The TSPA also has a careers page with a FAQ and further resources and links, one of which is this helpful article, How do you launch a career in Trust & Safety? by David Ryan Polgar, founder of All Tech is Human.

Ctrl-Alt-Speech is the new weekly podcast covering the week's major stories about online speech, content moderation and internet regulation from Everything in Moderation's Ben Whitelaw and Mike Masnick of Techdirt.

In this week's podcast, Mike and Ben cover the Murthy v. Missouri oral arguments in the Supreme Court, whether Reddit's IPO is a "content moderation success story" and a host of other need-to-know stories. Have a listen and email podcast@ctrlaltspeech.com with your feedback and thoughts for future episodes.

Also worth reading

Persistent interaction patterns across social media platforms and over time (Nature)
Why? Researchers studied behaviour patterns across 30 years of social media and found no significant change in the prevalence of toxic content online, suggesting that human behaviour, not modern social media platforms, is what makes the internet toxic. One interesting point: longer conversations tend to be more toxic.

Reddit's IPO is a content moderation success story (The New York Times)
Why? The author suggests that Reddit's volunteer moderation system and recent focus on community contributed to the successful IPO, proving the value of investing in moderation.

Key Findings from Stanford Event with Youth Online Safety Leaders & Federal Task Force (Stanford Internet Observatory Policy Center)
Why? Youth safety experts say more research is needed, yet action is urgent. They highlight that it's critical to listen to youth voices when designing solutions, and that there needs to be room for nuance and choice.

The AI Act is done. Here's what will (and won't) change. (MIT Technology Review)
Why? A good summary of the EU's new AI Act, highlighting which AI uses will be banned, what transparency measures will be required, and complaint mechanisms.

Shrimp Jesus (404 Media)
Why? It's shrimp Jesus!