Jun 1, 2026 5 min read T&S Insider

T&S Insider readers had questions; I answered

Debunking myths, working with AI — and working against it!

I'm Alice Hunsberger. Trust & Safety Insider is my weekly rundown on the topics, industry trends and workplace strategies that trust and safety professionals need to know about to do their job.

This week, I put an AMA out on LinkedIn. Here’s a selection of what you asked, along with my answers. Remember, this is never just a one-time thing: you can email me at any time with any T&S questions you may have.

Here we go! — Alice

What are the myths in content moderation using AI, if any? Which are true, which are false, what could have been done better? | Anna Truong

The first myth is that using AI for T&S is new. Machine learning is a type of AI, and T&S teams have been using ML models to automate T&S queues for a long time now. But let’s assume you mean using LLMs for T&S. LLMs are a much newer technology and so come with their own unique myths.

One myth I see a lot is that LLMs can’t be good at moderation because they won’t always come up with the exact same answer every single time — they’re probabilistic, not deterministic. The thing is, humans are not super consistent in their decision-making either, and they’re often held up as the gold standard for moderation. There have been plenty of times that I’ve made a moderation call, then gone back and looked at it again, and realized I was wrong. While LLMs won’t always have the exact same answer for every single moderation decision, they do come up with the same answer most of the time. And the times they don’t, they can give a reason, which often gives clues as to what the issue might be. From there, you can change the policy and steer the model to come up with the right answer every time. Whereas a fixed ML model will always be wrong until it gets retrained.
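If you want to put a number on that consistency, here is a minimal Python sketch of the idea: run the same item through the model several times and see how often the answers agree. Everything named here is hypothetical. `classify` stands in for whatever function calls your LLM, and `make_fake_classifier` is a toy stand-in (not a real model) so the example runs on its own. The 0.9 threshold is an arbitrary assumption, not a recommendation.

```python
import itertools
from collections import Counter
from typing import Callable


def agreement_rate(classify: Callable[[str], str], content: str, runs: int = 10) -> tuple[str, float]:
    """Classify the same content `runs` times; return the majority label
    and the share of runs that agreed with it."""
    labels = [classify(content) for _ in range(runs)]
    majority_label, count = Counter(labels).most_common(1)[0]
    return majority_label, count / runs


def find_unstable_items(
    classify: Callable[[str], str],
    items: list[str],
    runs: int = 10,
    threshold: float = 0.9,
) -> list[tuple[str, str, float]]:
    """Return (item, majority label, agreement rate) for items whose
    agreement rate falls below the threshold."""
    flagged = []
    for item in items:
        label, rate = agreement_rate(classify, item, runs)
        if rate < threshold:
            flagged.append((item, label, rate))
    return flagged


def make_fake_classifier() -> Callable[[str], str]:
    """Toy stand-in for an LLM call (assumption, for illustration only):
    it changes its answer on every third call for 'borderline' content."""
    calls = itertools.count()

    def classify(content: str) -> str:
        n = next(calls)
        if "borderline" in content:
            return "allow" if n % 3 == 0 else "remove"
        return "remove" if "spam" in content else "allow"

    return classify
```

Items that come back flagged are the ones worth reading closely: pairing them with the model's stated reasons is where you tend to find the policy wording that needs tightening.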

Another myth I hear a lot is that LLMs are biased, so they’re not appropriate for content moderation. If you ask an LLM a broad question, it will give you the most probable answer, which could well be biased, since the LLM is trained on biased human data. But as mentioned above, LLMs are not rigid tools; you can tweak and steer them to work for your specific policies, and avoid or disregard the learned bias. LLMs have also been trained on a lot of material around equity and bias mitigation, so when steered in that direction, they can do very well.

The final myth is that with LLM-optimized policies, you can set it and forget it. This would only be true if the LLM landscape weren’t constantly shifting: models change and drift, new models come out all the time (and others are no longer supported), and policy needs to change as you find more edge cases or as world events happen. Keeping track of all of that can be complicated, especially when more than one person is doing prompt engineering and optimization.
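One way to keep track, sketched loosely in Python: record every policy prompt change alongside the model it was tuned for and who changed it, then check which current prompts point at a model that is no longer supported. This is only an illustration under my own assumptions. The class name, field names and model IDs below are all made up, and a real team might just as well do this in a spreadsheet or version control.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PolicyPromptVersion:
    """One recorded change to a policy prompt (hypothetical schema)."""
    policy_name: str
    version: int
    model_id: str      # the model this prompt was optimized against
    author: str        # who made the change
    changed_on: date
    change_note: str   # why: new edge case, world event, model update...


def latest(versions: list[PolicyPromptVersion], policy_name: str) -> Optional[PolicyPromptVersion]:
    """Return the highest-numbered version of a policy, or None if absent."""
    candidates = [v for v in versions if v.policy_name == policy_name]
    return max(candidates, key=lambda v: v.version) if candidates else None


def needs_review(
    versions: list[PolicyPromptVersion],
    supported_models: set[str],
) -> list[PolicyPromptVersion]:
    """Return the latest version of each policy whose model is no longer
    in the supported set, i.e. prompts that need re-testing on a new model."""
    names = sorted({v.policy_name for v in versions})
    current = [latest(versions, name) for name in names]
    return [v for v in current if v is not None and v.model_id not in supported_models]
```

Because each record carries an author and a note, two people tuning the same policy can see each other's changes and the reason behind them, rather than overwriting one another's work.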
