
Grid Sensor #4399


Merged

merged 20 commits into Unity-Technologies:master on Aug 26, 2020

Conversation

@J-Travnik (Contributor) commented Aug 20, 2020

Proposed change(s)

The Grid Sensor combines the generality of data extraction from Raycasts with the image processing power of Convolutional Neural Networks. The Grid Sensor can be used to collect data in the general form of a "Width x Height x Channel" matrix which can be used for training Reinforcement Learning agents or for data analysis.

Motivation

In ML-Agents there are two main sensors for observing information that is "physically" around the agent.

Raycasts

Raycasts provide the agent the ability to see things along prespecified lines of sight, similar to LIDAR. The kind of data they can extract is up to the developer, including things like:

  • The type of an object (enemy, npc, etc)
  • The health of a unit
  • The damage-per-second of a weapon on the ground

Raycasts are simple to implement and provide enough information for most simple games; when only a few are used, they are also computationally fast. However, there are multiple limiting factors:

  • The rays need to be at the same height as the objects the agent should observe
  • Objects can remain hidden from the rays' lines of sight, and if knowledge of those objects is crucial to the agent's success, this limitation must be compensated for by the agent's network capacity (i.e., a bigger brain with memory is needed)
  • The spatial order of the raycasts (one raycast being to the left/right of another) is thrown away at the model level and must be learned by the agent, which extends training time; using multiple raycasts exacerbates this issue
  • Typically the length of the raycasts is limited because the agent need not know about objects on the other side of the level. Combined with using few raycasts for computational efficiency, this means an agent may not observe objects that fall between the rays, and the issue worsens as the objects decrease in size
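The "objects falling between rays" limitation can be illustrated with a small geometric sketch. This is not ML-Agents code; the function name and all numbers are purely illustrative:

```python
import math

def object_detected(num_rays, fov_deg, obj_angle_deg, obj_dist, obj_radius):
    """Return True if any of `num_rays` evenly spaced rays spanning `fov_deg`
    degrees hits a circular object (all names here are illustrative)."""
    # Angular half-width the object subtends as seen from the agent.
    half_width = math.degrees(math.asin(min(1.0, obj_radius / obj_dist)))
    if num_rays == 1:
        ray_angles = [0.0]
    else:
        step = fov_deg / (num_rays - 1)
        ray_angles = [-fov_deg / 2 + i * step for i in range(num_rays)]
    # The object is seen iff some ray falls within its angular extent.
    return any(abs(a - obj_angle_deg) <= half_width for a in ray_angles)

# A small object centered between two of 5 rays over a 90-degree field of
# view goes undetected, while a larger object at the same spot is seen:
object_detected(5, 90.0, 11.25, 10.0, 0.5)  # -> False
object_detected(5, 90.0, 11.25, 10.0, 3.0)  # -> True
```

With only 5 rays over 90 degrees, the gap between adjacent rays is 22.5 degrees, so anything subtending less than that can slip through undetected, exactly as described above.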

Camera

The Camera provides the agent with either a grayscale or an RGB image of the game environment. There are non-linear relationships between nearby pixels in an image; it is this intuition that helps form the basis of Convolutional Neural Networks (CNNs) and of the established literature on designing networks that take advantage of these relationships between pixels. Following this established literature of applying CNNs to image-based data, ML-Agents' Camera Sensor provides a means by which the agent can include high-dimensional inputs (images) in its observation stream.
However, the Camera Sensor has drawbacks of its own.

  • It requires rendering the scene and is thus computationally slower than alternatives that do not use rendering
  • It has not yet been shown that the Camera Sensor can be used on a headless machine, which means it is not yet possible (if at all) to train an agent on headless infrastructure
  • If the textures of the important objects in the game are updated, the agent needs to be retrained
  • The camera's RGB image provides at most 3 channels to the agent

These limitations provided the motivation towards the development of the Grid Sensor and Grid Observations as described below.

Contribution

An image can be thought of as a matrix with a predefined width (W) and height (H), where each pixel is simply an array of length 3 (in the case of RGB), [Red, Green, Blue], holding the color (channel) intensities at that pixel location. Thus an image is just a 3-dimensional matrix of size W x H x 3. A Grid Observation can be thought of as a generalization of this setup where, in place of a pixel, there is a "cell": an array of length N representing different channel intensities at that cell position.
From a Convolutional Neural Network point of view, the introduction of multiple channels in an "image" isn't a new concept. In fact, the original inspiration for the Grid Sensor came from MinAtar, which introduced a small suite of environments analogous to the Atari Learning Environment but where the representations were 10x10xn binary state representations. What distinguishes Grid Observations is what the data within the channels represents. Instead of limiting the channels to color intensities, the channels within a cell of a Grid Observation generalize to any data that can be represented by a single number (float or int), such as the type of object within a cell or the value of a certain property.
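The data layout described above can be sketched as follows. The actual Grid Sensor is implemented in C#; the dimensions, channel names, and values here are purely illustrative:

```python
import numpy as np

# Hypothetical grid dimensions and channel semantics for a Grid Observation.
WIDTH, HEIGHT = 10, 10
CHANNELS = ["enemy", "npc", "weapon_dps"]  # illustrative channel names

# A Grid Observation is a Width x Height x Channel matrix of floats.
grid = np.zeros((WIDTH, HEIGHT, len(CHANNELS)), dtype=np.float32)

# A one-hot "enemy present" flag in cell (3, 4):
grid[3, 4, CHANNELS.index("enemy")] = 1.0
# A continuous property (a weapon's damage-per-second) in cell (7, 2):
grid[7, 2, CHANNELS.index("weapon_dps")] = 12.5

grid.shape  # -> (10, 10, 3)
```

Unlike an RGB image, nothing restricts a cell's channels to three values or to color intensities; any per-cell property expressible as a number can occupy a channel.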

Additionally, this PR modifies the rpc_utils.py script to accept multiple PNGs, as was demonstrated in a Unity hack week.

See docs/Grid-Sensor.md for further documentation.

Types of change(s)

  • New feature
  • Code refactor
  • Documentation update

Checklist

Other comments

The Grid Sensor was developed in a collaboration between Eidos Montreal and Matsuko.

Developers

  • Jaden Travnik
  • Charles Pearson
  • Martin Certicky
  • Erik Gajdos
  • Romain Trachel
  • Alexandre Peyrot

@CLAassistant commented Aug 20, 2020

CLA assistant check
All committers have signed the CLA.

@chriselion chriselion self-assigned this Aug 20, 2020
GridSensorDummyData dummyData;

// Use built-in tags
const string k_Tag1 = "Player";
Contributor:

Unfortunately, the tags are specific to the project. So previously these tests were passing in our existing one but not a clean one. These tags are "built in" so they'll be present wherever the test is run.

* Initial version
Contributor:

TODO move to main changelog

Contributor:

Will do after merging PR, to avoid conflicts

@chriselion (Contributor):

Note: cancelled the CircleCI tests since their equivalents are running on github actions.

@chriselion (Contributor):

Will merge as soon as yamato tests pass on #4409

@chriselion left a comment

Looks great! I'll do a bit of additional cleanup after this is merged.

Thanks so much for contributing this!

@chriselion chriselion merged commit 4cb9168 into Unity-Technologies:master Aug 26, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 27, 2021