AI Fails to Understand Sarcasm in Non-American English Varieties

BESSTIE benchmark exposes large language models stumbling on Aussie, Indian, and British English sarcasm and sentiment.

Researchers launched BESSTIE to test sentiment and sarcasm detection across these English varieties. They pulled real data from Google Maps reviews and Reddit, using AI-powered language variety predictors to ensure authenticity. The goal: see if top models like RoBERTa, mBERT, Mistral, Gemma, and Qwen handle non-US English properly.

The verdict? Models perform better on Australian and British English than on Indian English, but sarcasm detection tanks. Sarcasm detection on Australian English reached only 62%, while Indian and British English slumped to about 57%. Sentiment detection fared better but still lags behind American English benchmarks.

The researchers point out that current AI benchmarks mainly test US English, leaving other English varieties underserved. They highlight earlier findings showing that models are more likely to flag text written in African-American English as hateful and often default to Standard American English regardless of the input variety.

“Large language models are more likely to classify a text as hateful if it is written in the African-American variety of English. They also often ‘default’ to Standard American English – even if the input is in other varieties of English, such as Irish English and Indian English.”

— Lead researcher comment on model bias

By comparison, the highest reported sentiment detection scores on US English are 97.5% (Turing ULR v6) and 96.7% (RoBERTa), far above BESSTIE’s results for the other English varieties.

The launch follows efforts such as a joint project by the University of Western Australia and Google to improve AI for Aboriginal English.

BESSTIE’s creators are also working on AI tools for bilingual patients in emergency departments, aiming to improve language understanding in critical healthcare settings.

This study underscores a big problem: AI’s English isn’t universal yet. The tech still struggles to grasp the global English it’s supposed to serve.
