Read The Times Australia

Daily Bulletin

I got generative AI to attempt an undergraduate law exam. It struggled with complex questions

  • Written by: Armin Alimardani, Lecturer, School of Law, University of Wollongong
I got generative AI to attempt an undergraduate law exam. It struggled with complex questions

It’s been nearly two years since generative artificial intelligence was made widely available to the public. Some models showed great promise by passing academic and professional exams.

For instance, GPT-4 scored higher than 90% of the United States bar exam test takers. These successes led to concerns AI systems might also breeze through university-level assessments. However, my recent study paints a different picture, showing it isn’t quite the academic powerhouse some might think it is.

My study

To explore generative AI’s academic abilities, I looked at how it performed on an undergraduate criminal law final exam at the University of Wollongong – one of the core subjects students need to pass in their degrees. There were 225 students doing the exam.

The exam was for three hours and had two sections. The first asked students to evaluate a case study about criminal offences – and the likelihood of a successful prosecution. The second included a short essay and a set of short-answer questions.

The test questions evaluated a mix of skills, including legal knowledge, critical thinking and the ability to construct persuasive arguments.

Students were not allowed to use AI for their responses and did the assessment in a supervised environment.

I used different AI models to create ten distinct answers to the exam questions.

Five papers were generated by just pasting the exam question into the AI tool without any prompts. For the other five, I gave detailed prompts and relevant legal content to see if that would improve the outcome.

I hand wrote the AI-generated answers in official exam booklets and used fake student names and numbers. These AI-generated answers were mixed with actual student exam answers and anonymously given to five tutors for grading.

Importantly, when marking, the tutors did not know AI had generated ten of the exam answers.

A man writes on a sheet of paper.
We handwrote the AI answers so markers would think they were done by students. Kate Aedon/Shutterstock

How did the AI papers perform?

When the tutors were interviewed after marking, none of them suspected any answers were AI-generated.

This shows the potential for AI to mimic student responses and educators’ inability to spot such papers.

But on the whole, the AI papers were not impressive.

While the AI did well in the essay-style question, it struggled with complex questions that required in-depth legal analysis.

This means even though AI can mimic human writing style, it lacks the nuanced understanding needed for complex legal reasoning.

The students’ exam average was 66%.

The AI papers that had no prompting, on average, only beat 4.3% of students. Two barely passed (the pass mark is 50%) and three failed.

In terms of the papers where prompts were used, on average, they beat 39.9% of students. Three of these papers weren’t impressive and received 50%, 51.7% and 60%, but two did quite well. One scored 73.3% and the other scored 78%.

A landing page for ChatGPT, asking 'How can I help you today?'
Generative AI has gained a reputation for passing difficult exams. Tada Images/ Shutterstock

What does this mean?

These findings have important implications for both education and professional standards.

Despite the hype, generative AI isn’t close to replacing humans in intellectually demanding tasks such as this law exam.

My study suggests AI should be viewed more like a tool, and when used properly, it can enhance human capabilities.

So schools and universities should concentrate on developing students’ skills to collaborate with AI and analyse its outputs critically, rather than relying on the tools’ ability to simply spit out answers.

Further, to make collaboration between AI and students possible, we may have to rethink some of the traditional notions we have about education and assessment.

For example, we might consider when a student prompts, verifies and edits an AI-generated work, that is their original contribution and should still be viewed as a valuable part of learning.

Authors: Armin Alimardani, Lecturer, School of Law, University of Wollongong

Read more https://theconversation.com/i-got-generative-ai-to-attempt-an-undergraduate-law-exam-it-struggled-with-complex-questions-240021

Business News

How Telematics Helps Australian Companies Improve Productivity

Operating a commercial fleet in Australia is a uniquely demanding endeavour. Between the sprawling urban sprawl of cities like Sydney and Melbourne and the immense, unforgiving stretches of the Outb...

Daily Bulletin - avatar Daily Bulletin

Inside the Icon: The BridgeMuseum Officially Opens at the Sydney Harbour Bridge

A bold new way to experience one of Australia’s most recognisable landmarks has arrived, with BridgeClimb Sydney officially opening the all-new BridgeMuseum.  Located inside the Sydney Harbour Brid...

Daily Bulletin - avatar Daily Bulletin

Is Your Brand Showing Up in AI Search? Most Melbourne Brands Aren't.

The New Front Door Nobody Told You About Something changed. Quietly. Without a press release. The way buyers find businesses in Australia has been rewired. Not replaced, rewired. Google isn't dead...

Daily Bulletin - avatar Daily Bulletin

How Australian Businesses Can Measure SEO ROI

SEO can feel vague when you are staring at a dashboard full of numbers that do not clearly connect to revenue. The key is to measure the right signals in the right order, then tie them back to outcome...

Daily Bulletin - avatar Daily Bulletin

How Commercial Roller Shutters Improve Site Security Without Slowing Operations

Security upgrades can be frustrating when they make everyday work harder. A door that takes too long to open, creates bottlenecks at shift change, or fails at the worst time can turn “better protectio...

Daily Bulletin - avatar Daily Bulletin

Why a Document Destruction Service Still Matters for Modern Businesses

Businesses generate large volumes of information every day, from staff records and contracts to invoices, reports and customer files. While attention often focuses on how documents are stored, the way...

Daily Bulletin - avatar Daily Bulletin

Bicycle Rack Safety and Space-Smart Storage

Bike storage problems usually show up as small annoyances first: tangled handlebars, scratched frames, and bikes that topple when you pull one out. Over time, those issues become safety risks, especia...

Daily Bulletin - avatar Daily Bulletin

How to Tell if a Childcare Centre Is a Good Fit for Your Child

Choosing childcare can feel like you’re making a huge decision with limited information. Tours are short, centres are often on their best behaviour, and your child might act differently in a new space...

Daily Bulletin - avatar Daily Bulletin

Car Import Timeline: What Usually Happens at Each Stage

Importing a car into Australia can feel confusing because multiple agencies and checkpoints are involved, and the timeline is shaped as much by paperwork quality as it is by shipping speed. The most u...

Daily Bulletin - avatar Daily Bulletin

The Daily Magazine

Gold Migration Lawyers in Liquidation: How the Closure Affects Your ART Appeal

If your appeal was with Gold Migration Lawyers, a recent change to how the Tribunal decides cases ...

The pressure cooker: life in urban Australia in 2026

Australian cities have always been demanding. Long commutes, rising housing costs, busy schedules a...

What Actually Makes a Good Criminal Lawyer in Melbourne

Most people only think about this question once. That is usually too late. Most people charged wi...

Why Working With A Chatswood Tutor Can Improve Academic Performance

Academic expectations continue increasing for students across primary school, high school, and senio...

Is It Worth Getting Solar Panels in Melbourne?

The real question is not whether solar works in Melbourne. It works. The question is what it is co...

How A Diploma Of Project Management Builds Practical Skills For Modern Work Environments

Developing the ability to plan, execute, and deliver outcomes efficiently is a key requirement in to...

How to Choose the Right Football for Every Level

Choosing a football may seem straightforward, but the right option depends on who will be using it a...

What to Ask a Wedding Photographer Before You Book

Booking a wedding photographer can feel deceptively simple: you like the photos, you like the vibe...

Why Stress Relief For Dogs Is Essential For Emotional Balance And Long-Term Wellbeing

Managing emotional health is just as important as physical care when it comes to pets, which is why ...