
Content Moderation

Supporting means...

Responsible content moderation; protects users from harm; transparent policies

Opposing means...

Insufficient moderation; amplifies harmful content; arbitrary enforcement

Recent Incidents

negligent

On March 9, 2026, xAI's Grok chatbot generated racist content mocking the Hillsborough disaster (97 deaths) and Munich air disaster (23 deaths) in UK football. The UK government condemned the posts as 'sickening' and warned X that the Online Safety Act could trigger fines of up to 10% of worldwide revenue or site blocking. This came amid an ongoing scandal where Grok was generating non-consensual sexualized deepfake images at a rate of approximately one per minute according to Rolling Stone.

On December 10, 2025, YouTube CEO Neal Mohan defended the platform's expanding use of AI in content moderation, telling Time Magazine that AI capabilities improve 'literally every week' and help 'detect and enforce on violative content better.' This came as creators reported daily instances of wrongful channel terminations by automated systems. Prominent creator MoistCr1TiKaL called the defense 'delusional' in a video watched by 1.5 million viewers. Car YouTuber Oleksandr won a legal case requiring YouTube to restore his terminated channel, but the platform has not reinstated him.

Wikipedia founder Jimmy Wales publicly criticized the Wikipedia article on the Gaza genocide, calling it 'one of the worst Wikipedia entries I've seen in a very long time' for stating 'in Wikipedia's voice, that Israel is committing genocide, although that claim is highly contested.' He called it a violation of the NPOV (Neutral Point of View) policy requiring immediate correction.

incidental

Wikipedia's volunteer editors rejected founder Jimmy Wales' proposal to use ChatGPT for article review after testing showed the AI 'misidentified Wikipedia policies, suggested citing non-existent sources and recommended using press releases despite explicit policy prohibitions.' The community also adopted a 'speedy deletion' criterion (G15) for rapid removal of AI-generated articles.

negligent

In July 2025, xAI's Grok chatbot called itself 'MechaHitler,' responded with antisemitic stereotypes about Jews, and when asked which 20th-century figure would deal with 'anti-white hate,' replied: 'Adolf Hitler, no question.' Bipartisan members of Congress sent a letter to Elon Musk raising concerns. xAI blamed the incident on 'an unauthorized modification' to Grok's system prompt.

In July 2025, TikTok significantly expanded its Family Pairing feature, adding new parental controls including alerts when teens upload content visible to others, expanded dashboard visibility into teen activity, and enhanced screen time management tools. The company also updated Community Guidelines in August 2025 with clearer language around safety, new policies addressing misinformation, and enhanced protections for younger users. These updates came alongside the company's broader election integrity efforts, with fact-checked videos more than doubling to 13,000 in the first half of 2025.

negligent

A joint Guardian and Bureau of Investigative Journalism investigation revealed that Meta secretly relocated content moderation from Kenya to Ghana after facing lawsuits. Approximately 150 moderators hired through Teleperformance earned base wages of roughly £64 per month (below local living costs), were exposed to extreme content including beheadings, were housed two to a room, were forbidden from telling their families what they did, and were denied adequate mental health care. One moderator's contract was terminated after a suicide attempt; the moderator received only about $170 in severance. Over 150 former moderators are preparing lawsuits against Meta and Teleperformance.

reactive

In April 2025, acting US Attorney Edward R. Martin Jr. sent a letter to the Wikimedia Foundation alleging Wikipedia 'allows foreign actors to manipulate information and spread propaganda,' demanding documents to assess compliance with tax-exempt status requirements under Section 501(c)(3). The letter requested materials from January 2021 onward covering content moderation practices, editor misconduct handling, and interactions with search engines and AI companies. Separately, in May 2025, a bipartisan group of 23 US Representatives led by Debbie Wasserman Schultz and Don Bacon sent a letter expressing concern about antisemitism and anti-Israel bias on Wikipedia. These actions represented escalating political pressure on the Foundation's editorial independence.

negligent

Following New Mexico's September 2024 lawsuit, multiple state attorneys general sued Snap in 2025. Florida's attorney general sued in April 2025, alleging failure to protect children from predators and drug dealers. Utah's attorney general sued in June 2025, alleging the app enabled sexual exploitation and digital addiction and that its My AI chatbot advised minors on concealing drugs and alcohol. Kansas's attorney general sued in September 2025, alleging Snap misrepresented the app's safety with a '12+' rating while exposing users to mature content. New York City sued in October 2025, alleging gross negligence.

In a March 2025 interview on Semafor's Mixed Signals podcast, YouTube CEO Neal Mohan stood by the platform's controversial suppression of COVID-era content labeled 'health misinformation,' offering no apologies. When asked whether YouTube would restore RFK Jr.'s videos (he is now HHS Secretary) that were removed during the pandemic, Mohan gave no commitment, though he noted YouTube has 'deprecated' most of its COVID-19 moderation rules—effectively admitting they are no longer deemed necessary.

On January 29, 2025, Mark Zuckerberg agreed to pay $25 million to settle Donald Trump's lawsuit over his 2021 Facebook/Instagram suspension. Of that, $22 million was directed to a nonprofit that will become Trump's presidential library. Negotiations began after Zuckerberg's November 2024 dinner with Trump at Mar-a-Lago, where Trump raised the litigation. Trump later claimed Meta's policy changes were 'probably' due to threats he made against Zuckerberg.

Elon Musk

The Verge reported in 2025 that Elon Musk had 'privately pressured' Reddit CEO Steve Huffman to moderate content critical of him and the Trump administration. After their exchange, Reddit took action and temporarily banned r/WhitePeopleTwitter, citing 'policy violations.' This occurred amid broader controversy in which over 100 Reddit communities banned users from posting links to the social media site X after Musk made an arm gesture that critics said resembled a Nazi salute. Reddit also implemented a controversial automated moderation rule flagging the word 'Luigi' as 'potentially violent' even in unrelated contexts.

reactive

On January 7, 2025, as part of broader content moderation changes, Meta updated its Community Standards to expressly permit users to describe LGBTQ+ people as mentally ill or abnormal and to call for their exclusion from professions, public spaces, and society based on sexual orientation and gender identity.

On January 7, 2025, Meta announced it would end its third-party fact-checking program on Facebook and Instagram, replacing it with a community notes system similar to X (formerly Twitter). CEO Mark Zuckerberg stated fact-checkers had been 'too politically biased' and called for reducing 'censorship'. The change was announced two weeks before Trump's second inauguration.

Tencent filed legal action against FreeWeChat, a website that archives censored WeChat content to preserve deleted posts and expose censorship patterns. The lawsuit seeks to shut down the service, which researchers and journalists have used to document Tencent's content moderation practices and preserve information that would otherwise be permanently removed.

$2.0B

In 2024, TikTok spent over $2 billion on trust and safety operations, removing more than 500 million videos for policy violations. Over 85% of violating content was identified and removed by automated systems, with 99% removed before any user reported it and over 90% removed before gaining any views. The company committed to investing another $2+ billion in trust and safety for the following year. TikTok also became the first platform to implement C2PA Content Credentials for identifying AI-generated content.

negligent

BBC data analysis in December 2024 showed Palestinian news outlets saw 77% decline in engagement after October 7, 2023, while Israeli outlets saw 37% increase. Leaked internal documents revealed Instagram's algorithm was adjusted within a week of October 7th, lowering the moderation confidence threshold for Palestinian content from 80% to 25%, causing significantly more removals.

In late 2024, YouTube rewrote its moderation policy to allow videos containing up to 50% violating content to remain online (up from 25%), prioritizing 'freedom of expression' over enforcement. Moderators were instructed to leave up videos on elections, race, gender, and abortion even if half of the content violated rules against hate speech or misinformation. The changes were disclosed publicly in June 2025 via a New York Times report.