Language models as community moderators

Jun 07, 2023

Community moderation works. This was the overwhelming lesson of the early internet. It works because it mirrors the social interaction of real life, where social groups exclude people who don’t fit in. And it works because it distributes the task of policing the internet to a vast number of volunteers, who provide the free labor of keeping forums fun, because to them maintaining a community is a labor of love. And it works because if you don’t like the forum you’re in — if the mods are being too harsh, or if they’re being too lenient and the community has been taken over by trolls — you just walk away and find another forum.
— Noah Smith, The internet wants to be fragmented

If you want to have public conversations of high quality, algorithmic content filtering is the wrong approach. As Noah Smith points out in the quote above, we’ve known for a long time that community moderation works better. But the internet has largely moved away from that, for several reasons. First, community moderation reduces conflict, which is bad for Twitter and other social media companies that want to increase engagement to sell more ads.

Second, community moderation doesn’t scale. To maintain good norms around conversations, human moderators need to read what is written and remove low-quality comments that otherwise, like a broken window on a street, would undermine the standards and lead more people to make stupid comments. (Most large communities do not moderate this strictly but limit moderator review to flagged comments, which leads to a corresponding drop in quality.) There is an inverse correlation between community size and quality of conversation.

My hunch is that conversations on platforms that rely on algorithmic content filtering are going to deteriorate even more when generative AI models start flooding the feed with convincing bots. The next US presidential election will give us a sense of how crazy that will be.

But the same AI models that threaten to undermine platforms like Twitter could also power more productive speech communities. Following Lars Doucet and Maggie Appleton, I predict that we’re going to see an acceleration of the trend of people moving off the big platforms in favor of chats and smaller forums that are community moderated and often gated. In these self-selected communities with clear norms, AI models can help out by automating some of the work that community moderators do. This could allow us to grow moderated communities larger than we’ve been able to before.

AI is already being used for community moderation in a small way, with OpenAI and others offering tools that can flag comments by reading them and indicating if they say anything offensive. This is still fairly close to normal algorithmic moderation. But as prices drop, we can go much further. We can do fluid taste-based moderation.
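To make the flagging step concrete, here is a minimal sketch of what such a tool does, under loud assumptions: `toy_scorer`, `triage`, `Decision`, and the 0.5 threshold are all hypothetical names and values I made up for illustration, and the scorer is a crude word list standing in for a hosted model. In a real community you would swap in a call to whatever moderation model you use; nothing here is any vendor's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical type: any function that maps a comment to a score in [0, 1],
# where a higher score means "more likely to break the community's norms".
Scorer = Callable[[str], float]


@dataclass
class Decision:
    comment: str
    score: float
    flagged: bool


def toy_scorer(comment: str) -> float:
    """Stand-in for a hosted model: scores by share of blocklisted words."""
    blocklist = {"idiot", "moron"}  # illustrative only
    words = [w.strip(".,!?").lower() for w in comment.split()]
    if not words:
        return 0.0
    hits = sum(w in blocklist for w in words)
    # Scale the hit ratio up and cap it at 1.0.
    return min(1.0, hits / len(words) * 5)


def triage(
    comments: Iterable[str], scorer: Scorer, threshold: float = 0.5
) -> list[Decision]:
    """Score every comment and flag those at or above the threshold."""
    decisions = []
    for comment in comments:
        score = scorer(comment)
        decisions.append(Decision(comment, score, score >= threshold))
    return decisions


if __name__ == "__main__":
    sample = ["You are an idiot.", "Thanks, this was a thoughtful post."]
    for d in triage(sample, toy_scorer):
        print(f"{'FLAG' if d.flagged else 'ok  '} {d.score:.2f} {d.comment}")
```

Because the scorer is passed in as a plain function, the same loop works whether the judgment comes from a word list, a classifier, or a language model prompted with the community's own norms. Only the last of these gets you anywhere near taste-based moderation.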

Escaping Flatland
