"OpenAIGym" (Reinforcement Learning Environments)
[Experimental]
"OpenAIGym" provides an interface to the Python OpenAI Gym reinforcement learning environments package.
The OpenAI Gym Python package is officially supported only on Linux and macOS. Gym provides several families of environments; the examples on this page use the "Atari" family. Depending on the host system, this family can be installed from the command line with one of the following commands (see the Gym installation instructions for further details):
$ pip install "gym[atari]"
$ pip3 install "gym[atari]"
$ pip install gym atari-py
ExternalEvaluate is used to interface with this Python package. It is up to the user to ensure that ExternalEvaluate is set up with a Python environment with Gym installed. A simple test to ensure that Gym is correctly set up is to run the following:
ExternalEvaluate["Python", "import gym"]
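The same check can also be run directly in the Python interpreter that ExternalEvaluate is configured to use. The following is a minimal sketch, not part of the "OpenAIGym" interface itself; the helper name gym_available is hypothetical, and it only reports whether the module can be located, not whether any particular environment family is installed.

```python
import importlib.util


def gym_available(module_name: str = "gym") -> bool:
    """Return True if this interpreter can locate the named top-level module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if gym_available():
    print("gym is importable from this interpreter")
else:
    print("gym was not found; install it into this interpreter's environment")
```

If this prints that gym was not found while pip reports it as installed, the package was probably installed for a different Python interpreter than the one being queried.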
Examples
Basic Examples
Open the "Atlantis-v0" Atari environment:
In this case, the "ObservedState" is an array of pixel values. Visualize this as an image:
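To illustrate what treating an array of pixel values as an image involves, here is a minimal Python sketch using only the standard library. It assumes the observation is a height x width x 3 array of integers in the range 0 to 255, which is the usual layout for Atari frames; the function name to_ppm and the tiny sample frame are hypothetical stand-ins, not output from the environment.

```python
def to_ppm(pixels):
    """Encode a height x width x 3 nested list of 0-255 ints as binary PPM bytes."""
    height = len(pixels)
    width = len(pixels[0])
    # PPM header: magic number, dimensions, maximum channel value.
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    # Flatten rows, then pixels, then RGB channels into one byte string.
    body = bytes(channel for row in pixels for pixel in row for channel in pixel)
    return header + body


# Hypothetical 2 x 3 frame standing in for a real observation.
frame = [
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    [[0, 0, 0], [128, 128, 128], [255, 255, 255]],
]
ppm = to_ppm(frame)
print(len(ppm), "bytes")
```

Writing the returned bytes to a file with a .ppm extension yields an image that most viewers can open.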
Taking an action usually modifies the "ObservedState":
Reset the environment to an initial state. The initial "ObservedState" is returned:
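The step and reset behavior described above can be sketched in Python without Gym installed. The CounterEnv class below is a hypothetical stand-in that only mimics the call shape of the classic Gym interface, where reset returns an observation and step returns an (observation, reward, done, info) tuple. Newer Gym releases return additional values from both calls, so treat this as an illustration of the idea rather than of any specific version.

```python
class CounterEnv:
    """Hypothetical stand-in mimicking the classic Gym reset/step call shape."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        # Return to the initial state and hand back the initial observation.
        self.t = 0
        return self.t

    def step(self, action):
        # Advance one time step; the observation changes with every action.
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.horizon
        return self.t, reward, done, {}


env = CounterEnv()
first = env.reset()
obs, reward, done, info = env.step(1)
changed = obs != first           # taking an action modified the observation
restored = env.reset() == first  # reset returns the initial observation again
print(first, obs, reward, done, changed, restored)
```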
Open the "Breakout-v0" environment:
Visualize a random agent playing Breakout:
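A random agent simply picks an action uniformly at random at every step until the episode ends. The following Python sketch shows that loop under stated assumptions: FixedLengthEnv is a hypothetical stand-in with the classic Gym reset/step call shape (it is not Breakout), and run_random_episode is an illustrative helper name. With a real Gym environment, the random action would typically come from env.action_space.sample() instead.

```python
import random


class FixedLengthEnv:
    """Hypothetical stand-in: the episode ends after a fixed number of steps."""

    def __init__(self, length=10):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 0 else 0.0  # only action 0 earns a reward
        return self.t, reward, self.t >= self.length, {}


def run_random_episode(env, n_actions, rng, max_steps=1000):
    """Play one episode with uniformly random actions; return (total reward, steps)."""
    env.reset()
    total, steps, done = 0.0, 0, False
    while not done and steps < max_steps:
        action = rng.randrange(n_actions)  # uniform choice over the action indices
        _, reward, done, _ = env.step(action)
        total += reward
        steps += 1
    return total, steps


total, steps = run_random_episode(FixedLengthEnv(), n_actions=4, rng=random.Random(0))
print(total, steps)
```

The max_steps cap is a design choice that guards against environments that never signal the end of an episode.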