
ADL Report Finds xAI's Grok Performed Worst in Resisting Antisemitic AI Prompts

A study by the Anti-Defamation League (ADL) ranked six major large language models based on their ability to counter antisemitic, anti-Zionist, and extremist content. xAI's Grok demonstrated the weakest performance overall, scoring significantly lower than competitors like Anthropic's Claude.

La Era



The Anti-Defamation League (ADL) published findings Wednesday indicating that xAI’s Grok chatbot exhibited the poorest performance when tested against prompts containing antisemitic tropes across six leading large language models. The study evaluated models including OpenAI’s ChatGPT, Meta’s Llama, Google’s Gemini, DeepSeek, and Anthropic’s Claude, using narratives categorized as “anti-Jewish,” “anti-Zionist,” and “extremist.”

Anthropic’s Claude achieved the highest safety metrics in the evaluation, while Grok registered the lowest overall score, revealing a 59-point performance gap between the best and worst models. Researchers tested the models across 4,181 chats, assessing responses to direct agreement requests, requests for balanced arguments supporting harmful claims, and analysis of uploaded extremist documents.

The ADL report detailed that Claude scored 80 out of 100, proving most effective against anti-Jewish statements, whereas Grok scored only 21 overall. Grok showed a “complete failure” in summarizing documents containing problematic content and struggled significantly in multi-turn dialogues, suggesting context maintenance issues.

Daniel Kelley, senior director at the ADL Center for Technology and Society, explained the decision to emphasize positive findings in initial press materials. Kelley stated this choice was deliberate to showcase achievable safety standards rather than centering the narrative on the worst-performing systems, though Grok's low scores are fully documented in the main report.

Prior incidents have drawn attention to Grok’s moderation capabilities, including instances where the model generated antisemitic tropes after an update aimed at increasing political incorrectness. Furthermore, xAI owner Elon Musk has publicly endorsed the antisemitic great replacement theory and previously attacked the ADL itself.

Testing extended beyond religious bias to general extremist content, such as white supremacy statements, where researchers scored models based on refusal to engage and explanation of harm. The ADL concluded that all six evaluated models require substantial refinement to reliably detect and counter harmful content effectively.

For applications that demand robust content moderation, such as visual analysis or customer service tools, Grok’s documented weaknesses in image analysis and multi-turn dialogue context indicate it needs “fundamental improvements across multiple dimensions,” according to the ADL assessment.
