Read The Times Australia

Daily Bulletin

Google’s new Go-playing AI learns fast, and even thrashed its former self

  • Written by: Geoff Goodhill, Professor of Neuroscience and Mathematics, The University of Queensland

Just last year Google DeepMind’s AlphaGo took the world of Artificial Intelligence (AI) by storm, showing that a computer program could beat the world’s best human Go players.

But in a demonstration of the feverish rate of progress in modern AI, details of a new milestone reached by an improved version called AlphaGo Zero were published this week in Nature.

Using less computing power and only three days of training time, AlphaGo Zero beat the original AlphaGo in a 100-game match by 100 to 0. It wasn’t even worth humans showing up.

Read more: Why Google wants to think more like you and less like a machine

Learning to play Go

Go is a game of strategy between two players who take it in turns to place “stones” on a 19x19 board. The goal is to surround a larger area of the board than your opponent.

image The game of Go, simple to learn but a lifetime to master… for a human. Paragorn Dangsombroon/Shutterstock

Go has proved much more challenging than chess for computers to master. There are many more possible moves in each position in Go than chess, and many more possible games.

The original AlphaGo first learned from studying 30 million moves of expert human play. It then improved beyond human expertise by playing many games against itself, taking several months of computer time.

By contrast, AlphaGo Zero never saw humans play. Instead, it began by knowing only the rules of the game. From a relatively modest five million games of self-play, taking only three days on a smaller computer than the original AlphaGo, it then learned super-AlphaGo performance.

Fascinatingly, its learning roughly mimicked some of the stages through which humans progress as they master Go. AlphaGo Zero rapidly learned to reject naively short-term goals and developed more strategic thinking, generating many of the patterns of moves often used by top-level human experts.

But remarkably it then started rejecting some of these patterns in favour of new strategies never seen before in human play.

Beyond human play

AlphaGo Zero achieved this feat by approaching the problem differently from the original AlphaGo. Both versions use a combination of two of the most powerful algorithms currently fuelling AI: deep learning and reinforcement learning.

To play a game like Go, there are two basic things the program needs to learn. The first is a policy: the probability of making each of the possible moves in a given position. The second is a value: the probability of winning from any given position.

In the pure reinforcement learning approach of AlphaGo Zero, the only information available to learn policies and values was for it to predict who might ultimately win. To make this prediction it used its current policy and values, but at the start these were random.

This is clearly a more challenging approach than the original AlphaGo, which used expert human moves to get a head-start on learning. But the earlier version learned policies and values with separate neural networks.

The algorithmic breakthrough in AlphaGo Zero was to figure out how these could be combined in just one network. This allowed the process of training by self-play to be greatly simplified, and made it feasible to start from a clean slate rather than first learning what expert humans would do.

How AlphaGo Zero learned to master Go.

An Elo rating is a widely used measure of the performance of players in games such as Go and chess. The best human player so far, Ke Jie, currently has an Elo rating of about 3,700.

AlphaGo Zero trained for three days and achieved an Elo rating of more than 4,000, while an expanded version of the same algorithm trained for 40 days and achieved almost 5,200.

This is an astonishingly large step up from the best human – far bigger than the current gap between the best human chess player Magnus Carlsen (about 2,800) and chess program (about 3,400).

The next challenge

AlphaGo Zero is an important step forward for AI because it demonstrates the feasibility of pure reinforcement learning, uncorrupted by any human guidance. This removes the need for lots of expert human knowledge to get started, which in some domains can be hard to obtain.

It also means the algorithm is free to develop completely new approaches that might have been much harder to find had it been been initially constrained to “think inside the human box”. Remarkably, this strategy also turns out to be more computationally efficient.

But Go is a tightly constrained game of perfect information, without the messiness of most real-world problems. Training AlphaGo Zero required the accurate simulation of millions of games, following the rules of Go.

For many practical problems such simulations are computationally unfeasible, or the rules themselves are less clear.

Read more: No more playing games: AlphaGo AI to tackle some real world challenges

There are still many further problems to be solved to create a general-purpose AI, one that can tackle a wide range of practical problems without domain-specific human intervention.

But even though humans have now comprehensively lost the battle with Go algorithms, luckily AI (unlike Go) is not a zero-sum game. Many of AlphaGo Zero’s games have now been published, providing a lifetime of inspirational study for human Go players.

More importantly, AlphaGo Zero represents a step towards a world where humans can harness powerful AIs to help find unimaginably (to humans) creative solutions to difficult problems. In the world of AI, there has never been a better time to Go for it.

Authors: Geoff Goodhill, Professor of Neuroscience and Mathematics, The University of Queensland

Read more http://theconversation.com/googles-new-go-playing-ai-learns-fast-and-even-thrashed-its-former-self-85979

Business News

Is Your Brand Showing Up in AI Search? Most Melbourne Brands Aren't.

The New Front Door Nobody Told You About Something changed. Quietly. Without a press release. The way buyers find businesses in Australia has been rewired. Not replaced, rewired. Google isn't dead...

Daily Bulletin - avatar Daily Bulletin

How Australian Businesses Can Measure SEO ROI

SEO can feel vague when you are staring at a dashboard full of numbers that do not clearly connect to revenue. The key is to measure the right signals in the right order, then tie them back to outcome...

Daily Bulletin - avatar Daily Bulletin

How Commercial Roller Shutters Improve Site Security Without Slowing Operations

Security upgrades can be frustrating when they make everyday work harder. A door that takes too long to open, creates bottlenecks at shift change, or fails at the worst time can turn “better protectio...

Daily Bulletin - avatar Daily Bulletin

Why a Document Destruction Service Still Matters for Modern Businesses

Businesses generate large volumes of information every day, from staff records and contracts to invoices, reports and customer files. While attention often focuses on how documents are stored, the way...

Daily Bulletin - avatar Daily Bulletin

Bicycle Rack Safety and Space-Smart Storage

Bike storage problems usually show up as small annoyances first: tangled handlebars, scratched frames, and bikes that topple when you pull one out. Over time, those issues become safety risks, especia...

Daily Bulletin - avatar Daily Bulletin

How to Tell if a Childcare Centre Is a Good Fit for Your Child

Choosing childcare can feel like you’re making a huge decision with limited information. Tours are short, centres are often on their best behaviour, and your child might act differently in a new space...

Daily Bulletin - avatar Daily Bulletin

Car Import Timeline: What Usually Happens at Each Stage

Importing a car into Australia can feel confusing because multiple agencies and checkpoints are involved, and the timeline is shaped as much by paperwork quality as it is by shipping speed. The most u...

Daily Bulletin - avatar Daily Bulletin

Portable Toilet Hygiene Standards Explained: Clean vs Sanitised vs Disinfected

In portable toilet servicing, the words clean, sanitised, and disinfected often get used as if they mean the same thing. They don’t. And that difference matters because a unit can look tidy and still ...

Daily Bulletin - avatar Daily Bulletin

Options Available When a Company Faces Financial Distress

Financial distress can develop gradually or arrive suddenly, and when it does, the decisions made in the early stages often determine what options remain available later. Directors who act promptly ...

Daily Bulletin - avatar Daily Bulletin

The Daily Magazine

What Actually Makes a Good Criminal Lawyer in Melbourne

Most people only think about this question once. That is usually too late. Most people charged wi...

Why Working With A Chatswood Tutor Can Improve Academic Performance

Academic expectations continue increasing for students across primary school, high school, and senio...

Is It Worth Getting Solar Panels in Melbourne?

The real question is not whether solar works in Melbourne. It works. The question is what it is co...

How A Diploma Of Project Management Builds Practical Skills For Modern Work Environments

Developing the ability to plan, execute, and deliver outcomes efficiently is a key requirement in to...

How to Choose the Right Football for Every Level

Choosing a football may seem straightforward, but the right option depends on who will be using it a...

What to Ask a Wedding Photographer Before You Book

Booking a wedding photographer can feel deceptively simple: you like the photos, you like the vibe...

Why Stress Relief For Dogs Is Essential For Emotional Balance And Long-Term Wellbeing

Managing emotional health is just as important as physical care when it comes to pets, which is why ...

Australia’s Best Walking Trails and the Shoes You Need to Tackle Them

Australia is not short on spectacular walks. You can follow ocean cliffs in Victoria, cross ancien...

Why Pre-Purchase Building Inspections Are Essential Before Buying a Home in Australia

source Have you ever walked through an open home and started picturing your furniture, family d...