Machine Learning Zone: OpenAI competition takes on Sonic the Hedgehog

Published April 5, 2018

Retro video games have been a useful platform for machine learning research for years, and the systems created have been creeping through the classics, mastering them as they go. Sonic the Hedgehog may be the next to fall: OpenAI has announced a competition to apply machine learning to the classic Sega game.

It’s not vastly different from what’s been attempted before: playing Super Mario Bros. or Space Invaders, or even the likes of Doom. But the rules are a bit different here.

A very basic summary of how AIs learn to play something like Mario is this: an algorithm is set up with some basic capabilities like recognizing objects on screen and monitoring the in-game score. It’s then set free on the game itself and allowed access to the controls, with the sole goal of maximizing its score.

Over millions of tries the machine learns that in order to score, it needs to hit start first, then that it needs to move to the right, then that goombas kill it (and stop it from scoring more), coins give it points and so on. It does this all basically from recognizing the shapes on the screen or, in some cases, from accessing the game geometry and system memory directly — it doesn’t care about the Princess, and it may develop strange behaviors that result from its single-minded pursuit of incrementing its score integer.
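To make that trial-and-error loop concrete, here is a minimal sketch using tabular Q-learning, one common reinforcement learning technique and not necessarily the one used by any system mentioned here. The "level" is a made-up eight-tile corridor with a single hazard, and every name in it (`LENGTH`, `HAZARD`, `step`, `train`, `greedy_run`) is hypothetical. The agent sees only its position and the reward, much as the systems above see only the screen and the score.

```python
import random

# Hypothetical toy level: tiles 0..LENGTH-1. Landing on HAZARD ends the
# run with a penalty; reaching the last tile ends it with a point.
LENGTH = 8
HAZARD = 4
ACTIONS = ["left", "right", "jump_right"]


def step(pos, action):
    """Apply one action and return (new_pos, reward, done)."""
    if action == "left":
        new_pos = max(0, pos - 1)
    elif action == "right":
        new_pos = pos + 1
    else:  # jump_right skips over one tile
        new_pos = pos + 2
    new_pos = min(new_pos, LENGTH - 1)
    if new_pos == HAZARD:
        return new_pos, -1.0, True
    if new_pos == LENGTH - 1:
        return new_pos, 1.0, True
    return new_pos, 0.0, False


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a value for every (position, action) pair by trial and error."""
    rng = random.Random(seed)
    q = {(p, a): 0.0 for p in range(LENGTH) for a in ACTIONS}
    for _ in range(episodes):
        pos, done, steps = 0, False, 0
        while not done and steps < 50:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit the best-known action, breaking ties at random.
                best = max(q[(pos, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(pos, a)] == best])
            new_pos, reward, done = step(pos, action)
            best_next = 0.0 if done else max(q[(new_pos, a)] for a in ACTIONS)
            # Nudge the estimate toward reward plus discounted future value.
            q[(pos, action)] += alpha * (reward + gamma * best_next - q[(pos, action)])
            pos = new_pos
            steps += 1
    return q


def greedy_run(q, max_steps=50):
    """Play once using only the learned values; return (score, path)."""
    pos, total, path = 0, 0.0, [0]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(pos, a)])
        pos, reward, done = step(pos, action)
        total += reward
        path.append(pos)
        if done:
            break
    return total, path
```

Nothing in the code tells the agent that the hazard is dangerous or that the goal is on the right. It works both out from the score alone, which is also why such agents happily exploit glitches if those pay off.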

This one, for example, learned that it can glitch through the walls to get ahead quickly:

Great job!

Another thing the OpenAI folks point out is that these systems often learn on the games and levels on which they are evaluated. It’s a sort of “teaching to the test” situation. So in the new competition, not only are the levels more complicated than Mario’s (as anyone who’s played Sonic can tell you), but the systems created will be tested on levels to which they’ve had limited exposure.

They won’t be going in blind, since the risk of an AI failing right from the start is too high. But while researchers will have all the time in the world to design a training and learning mechanism based on a selection of Sonic levels, the test will involve applying that training mechanism to a new set of levels, under a strict time limit (18 hours of game time).
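A rough sketch of that setup, under some loud assumptions: the frame rate (60 per second) is assumed rather than taken from the contest rules, the level names are placeholders, `RandomLearner` is a hypothetical stand-in that learns nothing, and whether the time limit applies per level or overall is not something this sketch settles.

```python
import random

FPS = 60         # assumption: about 60 emulated frames per second
GAME_HOURS = 18  # the time limit quoted for the test


def frame_budget(hours=GAME_HOURS, fps=FPS):
    """Convert hours of game time into a number of emulated frames."""
    return hours * 3600 * fps


def split_levels(levels, n_test, seed=0):
    """Hold out n_test levels that are never used while designing the agent."""
    shuffled = levels[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


class RandomLearner:
    """Hypothetical placeholder agent: earns a random reward each step."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def learn_step(self, level):
        return self.rng.random()


def evaluate(make_learner, test_levels, steps_per_level):
    """Run a learner on each held-out level for a fixed number of steps."""
    scores = {}
    for level in test_levels:
        # The learner may carry over what it picked up on the training
        # levels, but it has never seen this particular level before.
        learner = make_learner()
        total = 0.0
        for _ in range(steps_per_level):
            total += learner.learn_step(level)
        scores[level] = total / steps_per_level
    return scores


if __name__ == "__main__":
    levels = ["level_a", "level_b", "level_c", "level_d", "level_e", "level_f"]
    train_levels, test_levels = split_levels(levels, n_test=2)
    print("frames in 18 hours:", frame_budget())
    print(evaluate(RandomLearner, test_levels, steps_per_level=1000))
```

At the assumed 60 frames per second, 18 hours works out to 3,888,000 frames, which is not much by the standards of systems that usually get millions of tries.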

This means you have to create an agent that understands not just one level of Sonic, but Sonic as a gestalt. If your AI knows all the shortcuts in Green Hill Zone, it may excel there, but when sent to the Chemical Plant Zone, it’ll choke (like me) when it encounters the scary underwater parts.

You don’t jump like normal! It’s a lot of pressure with the stuff coming up!

It also means your algorithm has to train efficiently, which may involve all kinds of techniques and shortcuts. Minimizing training time means minimizing lazy learning and paying attention to multiple sources of information at once.

There are also different control methods, gimmicks and physics in each game, so identifying those before making the run could be critical to success. Really, there are all kinds of things to consider. (It’s making me want to go back and play these great games.)

Contestants will be using OpenAI’s Gym Retro platform, which essentially wraps an emulator playing Sonic (and a set of other Sega games) in the tools developers need to extract data, map inputs and so on.
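As an illustration of what "mapping inputs" can mean, here is a small sketch that turns a single numbered action into an array of controller button presses. The button order and the action list are assumptions made for the example, not values taken from Gym Retro, and `to_button_array` is a hypothetical helper.

```python
# Assumed button order for a Genesis-style controller. The real order
# depends on the emulator wrapper in use.
BUTTONS = ["B", "A", "MODE", "START", "UP", "DOWN",
           "LEFT", "RIGHT", "C", "Y", "X", "Z"]

# Hypothetical reduced action set for a platformer: each entry is the
# combination of buttons held down for that action.
COMBOS = [
    ["LEFT"],
    ["RIGHT"],
    ["DOWN"],
    ["B"],           # jump in place
    ["RIGHT", "B"],  # jump to the right
    ["LEFT", "B"],   # jump to the left
]


def to_button_array(action_index):
    """Translate an action number into a 0/1 entry per controller button."""
    pressed = set(COMBOS[action_index])
    return [1 if name in pressed else 0 for name in BUTTONS]
```

Shrinking twelve buttons down to a handful of sensible combinations gives the learner far fewer options to try, which matters when training time is limited.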

Winners don’t get any cash or anything, but first through third place will get trophies and will have the opportunity to co-author a report on the contest. OpenAI’s reports are interesting and widely read, so it sounds like a good opportunity if you have the time and inclination — although, of course, “it’s great exposure” is the classic payment avoidance strategy.

There are lots more games in the package OpenAI is using. I’d like to see an AI take on Gunstar Heroes, or Golden Axe III.