Tesh Goyal on devising a new framework for dealing with online harassment

Covering: using Perspective API to automatically detect abuse and open-sourcing Harassment Manager for other newsrooms
Tesh Goyal, Google Research's Head of User Research for Responsible AI Tools

'Viewpoints' is a space on EiM for people working in and adjacent to content moderation and online safety to share their knowledge and experience on a specific and timely topic.

Support articles like this by becoming a member of Everything in Moderation for less than $2 a week

Journalists, particularly women, are facing an online abuse crisis.

The evidence is overwhelming. Last year, the United Nations reported that online attacks — including threats of sexual and physical violence, fake accounts and “doxxing” — had led one in three women journalists to self-censor. Research by the Center for Media Engagement at the University of Texas found that "online harassment of women journalists is a global problem that needs to be addressed". In countries like Iran, women journalists face threats against their families and even torture.

I have no direct experience of this abuse myself, but I have seen it up close. Working in several UK newsrooms from 2010 onwards, I'd often be contacted by a colleague who had been targeted by angry social media users over an article they'd written. Often, the vitriol was personal and had nothing to do with their writing or the argument they'd made. It was tough to read.

Without any means of influencing the social media platforms, I was only ever able to offer the same advice: block and try to ignore. But that's not easy, especially when it happens again and again and again. Looking back, I didn't realise the toll that it took on colleagues and on women and people of colour working in other newsrooms. And it's right that the UN has come out and said this trend "cannot afford to be normalised or tolerated as an inevitable aspect of online discourse".

It's for this reason that I'm interested in the latest work coming out of Jigsaw, Google's technology unit that "explores threats to open societies" and has worked closely with news organisations like the New York Times since 2016. Its main focus has been Perspective API, a free API to help sites and services moderate user-generated content. Wikipedia and Reddit are two of the beneficiaries.  

Latterly, the Jigsaw team has been creating a tool called Harassment Manager to help women journalists document and manage abuse. I asked Tesh Goyal, who led user research for the Conversation AI team at Jigsaw before being made Google Research's Head of User Research for Responsible AI Tools, about the research behind the tool, the team's collaboration with Thomson Reuters and the impact he hopes it will have.

The interview has been lightly edited.

How did the project come about? And why did you decide to focus on abuse suffered by journalists and activists rather than other targeted groups?

Jigsaw is committed to harnessing the power of technology to tackle the largest threats to open societies and democratic institutions. Toxicity and online harassment marginalize important voices, particularly those of political figures, journalists, and activists. According to the International Women's Media Foundation (IWMF), 70% of female journalists receive threats and harassment online, and more than 40% have stopped reporting a story as a result. This threatens open discourse.

Our internal user research across multiple geographies with 50+ journalists and activists confirmed that targets of online harassment face an unmanageable and overwhelming volume of abuse across platforms. Further, the solutions available to them are limited in efficacy and scope, require immense labor, and still fall short of providing the kind of evidence people need to find justice. Designing and building Harassment Manager is a first step toward providing people with tools they need to manage their own experience and take action, at scale.

Your research, published in February 2022, proposes a new framework for how people experience harassment. Can you briefly explain it?

Thanks for asking about this important element of our work. We introduced the PMCR Framework to show that online harassment is not a monolithic event, and targets of harassment have different needs as they experience different stages of harassment. There are four needs: Prevention, Monitoring, Crisis, and Recovery. When we mapped a chronological trajectory of harassment, we realized that there are multiple stages: before harassment, during harassment, and after harassment. Through close collaboration and participatory research with targets of harassment, we realized that these stages have overlapping needs. Focusing on needs, as opposed to stages, unlocks real potential to help people.

“Prevention” is about safeguarding self to prevent future instances of harassment, and this is ongoing. An example of this might be constantly adjusting privacy and security settings. Similarly, “Monitoring” refers to surveilling potential harms and harmful actors, and this is a constant struggle across all the stages as well. An example of this might be setting alerts. “Crisis” and “Recovery” refer to the instance of an ongoing attack, and how to navigate the steps following the attack. During a Crisis, a person needs to identify what/who is causing the harm. Recovery requires creating evidence to manage this harm, amongst other user needs.

We are hoping that the community will start using the PMCR framework to further understand the needs of targets of harassment, or as a starting point to focus on users and their needs first to evolve the framework into what might be more useful for their context.

Viewpoints are about sharing the wisdom of the smartest people working in online safety and content moderation so that you can stay ahead of the curve.

They will always be free to read thanks to the generous support of EiM members, who pay less than $2 a week to ensure insightful Q&As like this one and the weekly newsletter are accessible for everyone.

Join today as a monthly or yearly member and you'll also get regular analysis from me about how content moderation is changing the world — BW

Why did you decide to focus design efforts on the Crisis and Recovery phases rather than anything further upstream?

We think that all user needs (Prevention, Monitoring, Crisis, and Recovery) are important. Multiple organizations (including PEN America, OnlineSOS, Hollaback, CCRI, Women TechMakers) have been playing an important and useful role in providing resources on how to best protect themselves to satisfy their prevention and monitoring needs. We support these efforts and collaborate with them. During our own research, including with Nobel Peace Prize winners and nominees, multiple journalists and activists pointed out that they are stuck in a cycle of Crisis and Recovery today. They need to collect irrefutable evidence of harassment to recover from the crisis. By focusing on these needs related to Crisis and Recovery, we were able to be a complementary partner to resources provided by the organizations listed above.

In our research, we also heard frequently that managing harassment during Crisis and Recovery requires significant emotional and physical labor. This includes engaging with harassment, documenting all of the content as screenshots, managing this content locally on one’s computer, and curating and reporting it across multiple channels. As you can imagine, this can be incredibly inequitable and debilitating. By focusing on Crisis and Recovery, we were able to leverage ML models built into Perspective API to automatically detect harmful content and add it to reports on the user’s behalf, alleviating some of this labor. We also designed an interface that centers on the well-being of the users when they are facing a crisis involving a large volume of harassment. The tool defaults to blurring out content and provides warnings to ensure users are ready to engage.
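The blur-by-default behaviour described above can be sketched in a few lines. This is a minimal illustration, not Harassment Manager's actual implementation: the threshold value and score dictionary shape are assumptions, standing in for the attribute scores a tool like this might receive from Perspective API.

```python
# Illustrative sketch: blur a piece of content by default when any
# model-assigned attribute score crosses a cutoff. The 0.8 threshold
# and the {attribute: score} shape are hypothetical choices, not
# Harassment Manager's real logic.

BLUR_THRESHOLD = 0.8  # hypothetical cutoff on a 0-1 probability score

def should_blur(scores: dict) -> bool:
    """Return True if any attribute score meets or exceeds the threshold."""
    return any(score >= BLUR_THRESHOLD for score in scores.values())

tweet_scores = {"TOXICITY": 0.91, "THREAT": 0.34}
print(should_blur(tweet_scores))  # True: the toxicity score exceeds the cutoff
```

The point of a default like this is that the user opts in to seeing abusive content, rather than having to opt out after already being exposed to it.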

You utilised Perspective API, Google’s open-source tool for detecting toxic text. Why did you decide to use that over other options?

Throughout the process of building and iterating on Perspective API, our team started to see clearly that there were different types of users who were affected differently by toxic content. We understood that people who are experiencing harassment online may have a separate need from content moderators, and started to explore how we might leverage parts of Perspective API to address those needs in a more compassionate way, through Harassment Manager.

Perspective API, built by the Conversation AI team behind Harassment Manager, uses machine learning to identify toxic comments, making it easier to host better conversations online. It’s used daily by around 500 developers, news publishers, and platforms, including Reddit and The New York Times, to make conversations safer. Perspective API is now available in 17 languages. We are also continuously thinking about how to improve it. For example, we have been investigating how the identity of data annotators affects the quality of the datasets that train our ML models, and we will publish our results later this year. So, as the team that built and is continuously improving Perspective API, it was a natural fit to find more ways of using this API beyond content moderation. We wanted to open source Harassment Manager, and leveraging another open-source API like Perspective API made this easier.
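For readers unfamiliar with Perspective API, a typical integration follows its public AnalyzeComment request/response shapes: you send a piece of text with the attributes you want scored, and get back a 0-1 probability per attribute. The sketch below builds a request body and parses a trimmed example response; the API key handling and network call are omitted, and the sample response values are made up for illustration.

```python
# Sketch of working with Perspective API's AnalyzeComment JSON shapes.
# The endpoint and field names follow the public API docs; no network
# call is made here, and the sample score is illustrative.

ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_request(text: str, attributes=("TOXICITY",)) -> dict:
    """Construct the JSON body for an AnalyzeComment request."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {attr: {} for attr in attributes},
    }

def summary_score(response: dict, attribute: str = "TOXICITY") -> float:
    """Extract the per-attribute summary probability from a response."""
    return response["attributeScores"][attribute]["summaryScore"]["value"]

# A trimmed response in the shape the API returns (value is made up):
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.87, "type": "PROBABILITY"}}
    }
}
print(summary_score(sample_response))  # 0.87
```

In a real integration, the request body would be POSTed to `ANALYZE_URL` with an API key, and attributes beyond TOXICITY (such as THREAT or INSULT) can be requested in the same call.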

You also partnered with Twitter for this project, presumably because it’s the main platform of use for journalists. What support did it provide and why do you think they came on board?

Twitter continues to be an incredibly important mechanism for journalists to engage with the wider community. Given the prominence of the Twitter platform for many communities of journalists and activists, and the fact that Twitter has recently developed several APIs that can be leveraged by users to take action on their own feed, we thought that building a tool on Twitter was a solid place to start.

Another example of Twitter’s commitment to this space was their launch of “nudges” to tweet authors in 2021, making them aware that their tweet might be violating Twitter policies. Our Authorship Feedback user research in 2019 with Coral, OpenWeb, and the Southeast Missourian found that this nudge theory-based approach works. So, it is wonderful to see Twitter implement it on their platform as well.

Owing to Twitter’s interest in this space, we were provided access to the Twitter API, which subsequently enabled us to engineer the tool over multiple iterations. This Twitter version of the tool is just the first step. Now that the code for Harassment Manager is open-sourced, our hope is that other ecosystem players can adopt it, iterate on it, and offer this tool to their communities. It could be expanded for cross-platform use, for example.

The code for Harassment Manager was recently open-sourced and will be developed by Thomson Reuters. What impact do you hope to see when the tool is rolled out to its staff?

As a first implementation partner, Jigsaw is working with Thomson Reuters Foundation to roll this tool out to journalists in its network in the next few months of 2022. As a leading defender of media freedom, TRF is a natural fit to make a technical solution available for its community of journalists. Our research with potential users at TRF has been encouraging so far and more details about the impact will be available when our second paper about the evaluation of Harassment Manager gets published later this year.

Have other organisations — news or otherwise — come forward since the announcement about using the tool for their staff?

We are encouraged by the positive feedback we’ve received since open-sourcing the Harassment Manager code. On the research side, the Association for Computing Machinery (ACM) has recognized the paper detailing the first half of this work with a Best Paper Honorable Mention at CHI 2022.

Our first implementation partner, the Thomson Reuters Foundation, is making strides toward offering the tool to its network of journalists, on track to launch this summer. We’ve heard from a number of other organizations and NGOs who are starting to explore building this tool, and we aim to get them the tools they need to roll it out accordingly.

What plans does Twitter have for incorporating any of the functionality of Harassment Manager natively into the platform?

While Jigsaw cannot speak on behalf of Twitter’s plans, the platform remains committed to providing tools to help users take control of their experience. Twitter’s collaboration with Jigsaw on Harassment Manager will further enable NGOs and other ecosystem partners to leverage the Twitter API to build valuable tools for the communities they serve, and we are looking forward to seeing the innovation that comes from this important partnership.

What other applications can Harassment Manager have? For example, have you spoken to colleagues in YouTube’s Trust and Safety team about what could be learned?

Jigsaw has open-sourced the code for Harassment Manager so that anyone can use it and adapt it to the specific needs of their communities. We are working closely with an ecosystem of partners who have expressed interest in the tool, and we aim to get them the tools they need to roll it out accordingly.  

We also intend to learn from these partners’ feedback about what’s working, what’s missing, and what else could be built. This is true across types of platforms as the user base grows. It is our hope that open-sourcing this code is the first step, and that other types of platforms could ultimately adapt it to offer the tool as well.

Want to share learnings from your work or research with 1000+ people working in online safety and content moderation?

Get in touch to put yourself forward for a Viewpoint or recommend someone that you'd like to hear directly from.