ML does Mario


I came across a model that intrigued me enough to trial it last night: a machine learning model that speedruns through Super Mario Bros. Cool, eh?

Here’s the repo on GitHub: uvipen/Super-mario-bros-PPO-pytorch. From its README:

Here is my Python source code for training an agent to play Super Mario Bros. using the Proximal Policy Optimization (PPO) algorithm introduced in the paper Proximal Policy Optimization Algorithms.

Talking about performance, my PPO-trained agent could complete 29/32 levels, which is much better than what I expected at the beginning.

For your information, PPO is the algorithm proposed by OpenAI and used for training OpenAI Five, which is the first AI to beat the world champions in an esports game. Specifically, OpenAI Five dispatched a team of casters and ex-pros with MMR rankings in the 99.95th percentile of Dota 2 players in August 2018.

I tried to install and run the code, to no avail, due to the lack of an Nvidia graphics card. Knew I shouldn’t have gone for the AMD option, let’s hope I’ve learned my lesson… and to think I could have used an Nvidia card with my Shield too!
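For anyone wanting to check their own machine first, here is a minimal sketch that tests whether an Nvidia GPU and driver are visible. It assumes the Nvidia driver's nvidia-smi tool is what you'd use to detect the card. The function name has_nvidia_gpu is my own, not something from the repo.

```shell
#!/bin/sh
# Sketch only: has_nvidia_gpu is a made-up helper name, not part of the repo.
# Succeeds when nvidia-smi is on the PATH and can list at least one GPU.
has_nvidia_gpu() {
    command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1
}

if has_nvidia_gpu; then
    echo "Nvidia GPU and driver found"
else
    echo "No usable Nvidia GPU found"
fi
```

On an AMD-only box like mine this should report that no usable Nvidia GPU was found.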

Anyway, I couldn’t get the code to work, so thought that I’d try the Docker option that the dev had kindly provided.

Now, I’m only vaguely aware of what Docker does, so I followed this great guide hosted on DigitalOcean: How To Install and Use Docker on Ubuntu 20.04

It’s a great, easy-to-follow guide, but still no luck on an Nvidia-less machine. I’d noticed while pulling the Docker image that gigabytes of data were downloaded and, I assumed, saved, so I went looking for it to free up some space.

docker images

So here’s what I wanted to share: a guide on How To Remove Docker Containers, Images, Volumes, and Networks
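As a rough sketch of the kind of commands involved, the snippet below wraps the read-only ones in a small helper and leaves the destructive ones as comments to run by hand. The helper name docker_report is my own invention, and CONTAINER_ID and IMAGE_ID are placeholders for values taken from the listings. Depending on how Docker was installed, you may need to prefix the commands with sudo.

```shell
#!/bin/sh
# Sketch only: docker_report is a made-up helper name. It deletes nothing.
docker_report() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found on PATH"
        return 1
    fi
    docker images      # every local image, with its size
    docker ps -a       # all containers, including stopped ones
    docker system df   # disk used by images, containers, volumes and build cache
}

# Usage: run  docker_report  after defining the function above.
#
# Destructive cleanup, run by hand once you know what should go:
#   docker rm CONTAINER_ID    # remove one stopped container
#   docker rmi IMAGE_ID       # remove one image
#   docker system prune -a    # remove stopped containers, unused networks,
#                             # all unused images and the build cache
```

Note that docker system prune -a asks for confirmation before deleting, and it leaves volumes alone unless you add --volumes.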

Now that’s all cleaned up, I’m going to learn how to use Docker properly, as I’m sure I’ll have a use for it in my current Raspberry Pi projects 🙂

Update: found this document useful: Install Docker Engine on Ubuntu