
Can LLMs fix the flaws in user reporting?

Large Language Models are being tested for everything from transparency to content review. But could they help modernise one of the oldest T&S processes — how users report harm and appeal moderation decisions?

I'm Alice Hunsberger. Trust & Safety Insider is my weekly rundown on the topics, industry trends and workplace strategies that trust and safety professionals need to know about to do their job.

This week, I explore the use of LLMs in user reports and appeals. I feel quite optimistic about the possibilities. However, I did find a real limit to LLMs: I asked one for a T&S-themed dad joke in honour of Father's Day, and the results were all serious duds. Here's the best one:

What’s a content moderator’s favourite pickup line?
“Are you violating terms of service? Because you’ve been on my watchlist all day.”

Happy Father's Day to all the dads reading this. As always, get in touch if you'd like your questions answered or just want to share your feedback (or if you have any better T&S-themed jokes to share). Here we go! — Alice


IN PARTNERSHIP WITH Resolver Trust & Safety

Inviting you to our TrustCon Mixer!

Attending TrustCon this year? We’d love to help you kick off the week with a relaxed networking event hosted by Resolver.

Resolver’s Mixer
• Sunday, July 20
• 7:00 PM – 9:00 PM PT
• Hyatt Regency, San Francisco

Whether you're a long-time collaborator or just curious to meet the people behind Resolver's Trust & Safety efforts, this is a great chance to connect.

RSVP NOW

An LLM-aided model for user reporting

Why this matters: As I wrote about last week, current reporting systems are far from perfect. User reports often provide noisy, unreliable data, making it incredibly challenging for Trust & Safety teams to effectively address harm, especially for mid-sized platforms grappling with limited resources. However, Large Language Models (LLMs) open up new possibilities for better, more personalised feedback on user reports and appeals.

The idea of better explaining moderation decisions to users has been around for a while. 

My team at OkCupid talked about it six or seven years ago, when faced with the challenge of reforming transphobic users who were abusing the site’s reporting tool. Our idea was to try to teach users about our values and guidelines, with a view to lowering the number of user reports and increasing the quality of those we received. That seems hopelessly naive today but, at the time, great strides were being made towards gay marriage equality, so we were hopeful that education could help.

However, what we came up with was a hugely manual process in which my team sent dozens of hand-written replies and reminders to users. It was very resource-heavy and financially untenable for the team and the company at the time, so the idea was shelved.
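This is the part of the process an LLM could plausibly take over today: drafting, rather than deciding. As a rough illustration only, here is a minimal sketch of what that drafting step might look like. It assumes the `openai` Python package and an OpenAI-compatible chat endpoint; the model name, guideline text and function are hypothetical placeholders, not a description of any platform's actual system.

```python
# Illustrative sketch: drafting a personalised moderation explanation with an LLM.
# Assumes the `openai` package and an API key in OPENAI_API_KEY; model name,
# guideline wording and prompt are placeholders for illustration only.
from openai import OpenAI

client = OpenAI()

def draft_appeal_reply(guideline: str, decision: str, appeal_text: str) -> str:
    """Draft (not send) a reply explaining a moderation decision, for human review."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": (
                    "You write short, respectful replies to users appealing a "
                    "moderation decision. Explain which guideline applied and why, "
                    "without revealing reporter details or making legal claims."
                ),
            },
            {
                "role": "user",
                "content": (
                    f"Guideline: {guideline}\n"
                    f"Decision: {decision}\n"
                    f"User's appeal: {appeal_text}\n"
                    "Draft a reply for a human moderator to review before sending."
                ),
            },
        ],
    )
    return response.choices[0].message.content

# Hypothetical usage:
# print(draft_appeal_reply(
#     guideline="No harassment or hate speech.",
#     decision="Profile text removed for targeting another user.",
#     appeal_text="I don't understand why my profile was flagged.",
# ))
```

The point of the sketch is that the human stays in the loop: the model produces a draft for a moderator to check, which addresses the bit of the old OkCupid process that was untenable (the writing), not the bit that still needs judgment (the decision).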

Get access to the rest of this edition of EiM and 200+ others by becoming a paying member