‘Are you joking, mate?’ AI doesn’t get sarcasm in non-American varieties of English

Written by: Aditya Joshi, Senior Lecturer, School of Computer Science and Engineering, UNSW Sydney

In 2018, my Australian co-worker asked me, “Hey, how are you going?” My response – “I am taking a bus” – was met with a smirk. I had recently moved to Australia. Despite having studied English for more than 20 years, it took me a while to familiarise myself with the Australian variety of the language.

It turns out large language models powered by artificial intelligence (AI) such as ChatGPT experience a similar problem.

In new research, published in the Findings of the Association for Computational Linguistics 2025, my colleagues and I introduce a new tool for evaluating the ability of different large language models to detect sentiment and sarcasm in three varieties of English: Australian English, Indian English and British English.

The results show there is still a long way to go until the promised benefits of AI are enjoyed by all, no matter the type or variety of language they speak.

Limited English

Large language models are often reported to achieve superlative performance on several standardised sets of tasks known as benchmarks.

The majority of benchmark tests are written in Standard American English. This implies that, while large language models are being aggressively sold by commercial providers, they have predominantly been tested – and trained – only on this one type of English.

This has major consequences.

For example, in a recent survey my colleagues and I found large language models are more likely to classify a text as hateful if it is written in the African-American variety of English. They also often “default” to Standard American English – even if the input is in other varieties of English, such as Irish English and Indian English.

To build on this research, we built BESSTIE.

What is BESSTIE?

BESSTIE is a first-of-its-kind benchmark for sentiment and sarcasm classification in three varieties of English: Australian English, Indian English and British English.

For our purposes, “sentiment” is whether the emotion expressed is positive (the Aussie “not bad!”) or negative (“I hate the movie”). Sarcasm is defined as a form of verbal irony intended to express contempt or ridicule (“I love being ignored”).

To build BESSTIE, we collected two kinds of data: reviews of places on Google Maps and Reddit posts. We carefully curated the topics and employed language variety predictors – AI models specialised in detecting the language variety of a text. We kept only the texts that a predictor assigned to a specific language variety with a probability greater than 95%.

Together, the two steps (location filtering and language variety prediction) ensured the data represented each national variety, such as Australian English.
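
The confidence filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to build BESSTIE: the `Candidate` record, the `select_for_variety` function, the variety codes such as `"en-AU"` and the probabilities shown are all hypothetical. Only the 95% cut-off comes from the article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """Hypothetical record: a text plus the probabilities a variety predictor gave it."""
    text: str
    variety_probs: Dict[str, float]


THRESHOLD = 0.95  # the article's cut-off: keep texts above 95% probability


def select_for_variety(candidates: List[Candidate], variety: str,
                       threshold: float = THRESHOLD) -> List[str]:
    """Keep texts whose predicted probability for `variety` exceeds the threshold."""
    return [c.text for c in candidates
            if c.variety_probs.get(variety, 0.0) > threshold]


# Made-up probabilities, purely to exercise the filter.
candidates = [
    Candidate("Not bad, mate!", {"en-AU": 0.97, "en-UK": 0.02, "en-IN": 0.01}),
    Candidate("I hate the movie", {"en-AU": 0.40, "en-UK": 0.35, "en-IN": 0.25}),
]
selected = select_for_variety(candidates, "en-AU")
```

Here only the first text is kept for Australian English; the second is dropped because no single variety is predicted with enough confidence.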

We then used BESSTIE to evaluate nine powerful, freely usable large language models, including RoBERTa, mBERT, Mistral, Gemma and Qwen.

Inflated claims

Overall, we found the large language models we tested worked better for Australian English and British English (which are native varieties of English) than the non-native variety of Indian English.

We also found large language models are better at detecting sentiment than they are at sarcasm.

Sarcasm is particularly challenging, both as a linguistic phenomenon and as a task for AI. For example, we found the models were able to detect sarcasm in Australian English only 62% of the time. This number was lower for Indian English and British English – about 57%.
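
To make the idea of a per-variety score concrete, here is a minimal sketch that groups labelled examples by variety and reports the share of correct predictions for each. It assumes a plain "fraction correct" measure, which may differ from the metric used in the research itself. The function name `accuracy_by_variety`, the variety codes and the rows are made up for illustration and are not BESSTIE data.

```python
from collections import defaultdict


def accuracy_by_variety(rows):
    """rows: iterable of (variety, gold_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for variety, gold, pred in rows:
        total[variety] += 1
        correct[variety] += int(gold == pred)
    # Share of correct predictions for each variety seen in the input.
    return {variety: correct[variety] / total[variety] for variety in total}


# Made-up toy rows (1 = sarcastic, 0 = not sarcastic), not real results.
rows = [
    ("en-AU", 1, 1),
    ("en-AU", 0, 0),
    ("en-AU", 1, 0),
    ("en-AU", 0, 0),
    ("en-IN", 1, 0),
    ("en-IN", 0, 0),
    ("en-IN", 1, 1),
    ("en-IN", 0, 1),
]
scores = accuracy_by_variety(rows)
```

Reporting one number per variety, rather than a single pooled score, is what exposes gaps like the one between Australian English and the other two varieties.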

These scores are lower than those claimed by the tech companies that develop large language models. For example, GLUE is a leaderboard that tracks how well AI models perform at sentiment classification on American English text.

The highest score on that leaderboard is 97.5%, for the model Turing ULR v6, while RoBERTa (from our suite of models) scores 96.7%. Both figures, reported for American English, are higher than the scores we observed for Australian, Indian and British English.

National context matters

As more and more people around the world use large language models, researchers and practitioners are waking up to the fact that these tools need to be evaluated for a specific national context.

For example, earlier this year the University of Western Australia along with Google launched a project to improve the efficacy of large language models for Aboriginal English.

Our benchmark will help evaluate future large language model techniques for their ability to detect sentiment and sarcasm. We’re also currently working on a project for large language models in emergency departments of hospitals to help patients with varying proficiencies of English.

Read more https://theconversation.com/are-you-joking-mate-ai-doesnt-get-sarcasm-in-non-american-varieties-of-english-254986
