[Bug fix] Gym last reward before Done #3471

vincentpierre · 2020-02-19T00:25:53Z

Proposed change(s)

Second attempt at fixing gym.
This time we explicitly keep track of the agent_ids that gym has and their order.
When an agent is done, we replace its id in the current list with -1 and we wait for a new agent to take on the id.
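
A minimal sketch of the bookkeeping described above, assuming agent ids are non-negative so that -1 can safely mark a free slot. The names `GymIdOrder`, `mark_done`, and `register_new` are hypothetical and are not taken from gym_unity; only the idea of an ordered list with a -1 placeholder comes from the description.

```python
from typing import List

FREE_SLOT = -1  # placeholder for a slot whose agent is done (assumes ids >= 0)


class GymIdOrder:
    """Hypothetical helper: keeps agent ids in the order gym first saw them."""

    def __init__(self, agent_ids: List[int]) -> None:
        self._order = list(agent_ids)

    def mark_done(self, agent_id: int) -> int:
        """Free the slot held by agent_id and return its gym index."""
        index = self._order.index(agent_id)  # O(N) scan of the list
        self._order[index] = FREE_SLOT
        return index

    def register_new(self, agent_id: int) -> int:
        """Give the first free slot to a newly arrived agent."""
        index = self._order.index(FREE_SLOT)  # ValueError if no slot is free
        self._order[index] = agent_id
        return index

    def as_list(self) -> List[int]:
        return list(self._order)


order = GymIdOrder([10, 11, 12])
freed = order.mark_done(11)     # the slot of agent 11 becomes -1
taken = order.register_new(42)  # agent 42 inherits that slot
```

Because the new agent takes over the freed index, the position of every other agent in the gym-facing arrays stays stable across resets.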

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

#3460
MLA-651

Types of change(s)

[x] Bug fix
~~[ ] New feature~~
[x] Code refactor
~~[ ] Breaking change~~
~~[ ] Documentation update~~
~~[ ] Other (please describe)~~

Checklist

[x] I have added tests that prove my fix is effective or that my feature works
~~[ ] I have updated the changelog (if applicable)~~
~~[ ] I have added necessary documentation (if applicable)~~
~~[ ] I have updated the migration guide (if applicable)~~

Other comments

ervteng · 2020-02-19T01:27:51Z

gym-unity/gym_unity/envs/__init__.py

@@ -121,6 +124,7 @@ def __init__(
        step_result = self._env.get_step_result(self.brain_name)
        self._check_agents(step_result.n_agents())
        self._previous_step_result = step_result
+        self._gym_id_order = list(self._previous_step_result.agent_id)


Is there a way we could make _gym_id_order a dict of agent_id to index instead of a list? That way we don't have to do O(N) index operations.

Probably don't want to do this but it might work: https://stackoverflow.com/questions/1456373/two-way-reverse-map

Unfortunately, you would have to go both ways: id to index and index to id. Having a list is implicitly a dict from index to id.

I see, so you'd need the two-way dict to make it O(1) in both directions. If this code is only called when an agent is done, it might be OK. It could be horrific for thousands of agents, though.

I think it's worth trying to remove these index calls if possible. But getting tests first is more important, then you can optimize.
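
To make the trade-off in this thread concrete, here is a rough sketch of a two-way mapping that gives O(1) lookups in both directions. `TwoWayIdMap` and its methods are hypothetical names used for illustration; this is not the implementation that landed in the PR.

```python
from typing import Dict, List


class TwoWayIdMap:
    """Hypothetical bidirectional map between agent ids and gym indices."""

    def __init__(self, agent_ids: List[int]) -> None:
        self._id_to_index: Dict[int, int] = {}
        self._index_to_id: Dict[int, int] = {}
        for index, agent_id in enumerate(agent_ids):
            self._id_to_index[agent_id] = index
            self._index_to_id[index] = agent_id

    def index_of(self, agent_id: int) -> int:
        return self._id_to_index[agent_id]  # O(1) on average

    def id_at(self, index: int) -> int:
        return self._index_to_id[index]  # O(1) on average

    def replace(self, old_id: int, new_id: int) -> int:
        """Hand the slot of old_id to new_id, updating both directions."""
        index = self._id_to_index.pop(old_id)
        self._id_to_index[new_id] = index
        self._index_to_id[index] = new_id
        return index


mapping = TwoWayIdMap([10, 11, 12])
slot = mapping.replace(11, 42)  # agent 42 takes over the slot of agent 11
```

The cost is that every update has to touch two dicts, which is the extra bookkeeping the comments above are weighing against the O(N) `list.index` calls.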

I added 2 tests for sanitize info

chriselion · 2020-02-19T03:36:19Z

Since this has broken twice now, it definitely needs some tests. It looks like none of the logic that was added previously has any coverage, and neither do most of the changes here.

gym-unity/gym_unity/tests/test_gym.py

* Fixing #3460
* Addressing comments
* Added 2 tests
* encapsulate the agent mapping operations (#3481)
* encapsulate the agent mapping operations
* rename, linear time impl
* cleanup
* dict.popitem
* update comments
* Update gym-unity/gym_unity/tests/test_gym.py

Co-authored-by: Chris Elion <[email protected]>

Fixing #3460

b19e7fd

vincentpierre requested review from ervteng and chriselion February 19, 2020 00:25

vincentpierre self-assigned this Feb 19, 2020

vincentpierre changed the title ~~Fixing #3460~~ [Bug fix] Gym last reward before Done Feb 19, 2020

chriselion reviewed Feb 19, 2020

Addressing comments

322ce3e

ervteng reviewed Feb 19, 2020

chriselion reviewed Feb 19, 2020

chriselion reviewed Feb 19, 2020

chriselion reviewed Feb 19, 2020

vincentpierre and others added 2 commits February 19, 2020 11:13

Added 2 tests

8cceca0

encapsulate the agent mapping operations (#3481)

4885920

* encapsulate the agent mapping operations
* rename, linear time impl
* cleanup
* dict.popitem
* update comments
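
The commit notes above mention `dict.popitem`, and the PR title concerns the last reward before done. One way those two ideas could fit together is sketched below: the final reward is stored with the freed slot and handed back when a new agent claims it. `SlotRewardTracker` and its methods are hypothetical, and this is an illustration under those assumptions rather than the code merged in #3481.

```python
from typing import Dict, List


class SlotRewardTracker:
    """Hypothetical sketch: remembers the last reward of each freed slot."""

    def __init__(self, agent_ids: List[int]) -> None:
        self._id_to_index: Dict[int, int] = {
            agent_id: index for index, agent_id in enumerate(agent_ids)
        }
        self._freed_index_to_reward: Dict[int, float] = {}

    def mark_done(self, agent_id: int, reward: float) -> None:
        """Free the agent's slot and keep its final reward with the slot."""
        index = self._id_to_index.pop(agent_id)
        self._freed_index_to_reward[index] = reward

    def register_new(self, agent_id: int) -> float:
        """Assign a freed slot to agent_id and return the stored reward."""
        # popitem removes one (index, reward) pair; KeyError if none is free
        index, last_reward = self._freed_index_to_reward.popitem()
        self._id_to_index[agent_id] = index
        return last_reward

    def index_of(self, agent_id: int) -> int:
        return self._id_to_index[agent_id]


tracker = SlotRewardTracker([10, 11])
tracker.mark_done(11, 1.5)             # agent 11 finishes with reward 1.5
last_reward = tracker.register_new(42)  # agent 42 inherits the slot
```

With this shape, the wrapper can report the stored reward for the step on which the episode ended instead of the reward of the freshly spawned agent.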

chriselion reviewed Feb 24, 2020

Update gym-unity/gym_unity/tests/test_gym.py

2076fd0

chriselion approved these changes Feb 24, 2020

vincentpierre merged commit ea54e4a into master Feb 24, 2020

delete-merged-branch bot deleted the develop-fix-gym-last-reward branch February 24, 2020 20:16

harperj mentioned this pull request Mar 11, 2020

gym_unity seems to provide a reward of 0.0 for the final step #3460

Closed

github-actions bot locked as resolved and limited conversation to collaborators May 16, 2021
