Skip to content

[Bug fix] Gym last reward before Done #3471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 24, 2020

Conversation

vincentpierre
Copy link
Contributor

@vincentpierre vincentpierre commented Feb 19, 2020

Proposed change(s)

Second attempt at fixing gym.
This time we explicitly keep track of the agent_ids that gym has and their order.
When an agent is done, we replace its id in the current list with -1 and we wait for a new agent to take on the id.

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

#3460
MLA-651

Types of change(s)

  • Bug fix
  • [ ] New feature
  • Code refactor
  • [ ] Breaking change
  • [ ] Documentation update
  • [ ] Other (please describe)

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added updated the changelog (if applicable)
  • [ ] I have added necessary documentation (if applicable)
  • [ ] I have updated the migration guide (if applicable)

Other comments

@vincentpierre vincentpierre self-assigned this Feb 19, 2020
@vincentpierre vincentpierre changed the title Fixing #3460 [Bug fix] Gym last reward before Done Feb 19, 2020
@@ -121,6 +124,7 @@ def __init__(
step_result = self._env.get_step_result(self.brain_name)
self._check_agents(step_result.n_agents())
self._previous_step_result = step_result
self._gym_id_order = list(self._previous_step_result.agent_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we could make _gym_id_order a dict of agent_id to index instead of a List? That way we don't have to do O(N) index operations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably don't want to do this but it might work: https://stackoverflow.com/questions/1456373/two-way-reverse-map

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, you would have to go both ways : id to index and index to id. Having a list id implicitly a dict from index to id.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see so you'd have to do the two way dict to make it O(1) in both directions. If this code is only called when an agent is Done it might be OK. Could be horrific for 1000's of agents though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth trying to remove these index calls if possible. But getting tests first is more important, then you can optimize.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added 2 tests for sanitize info

@chriselion
Copy link
Contributor

Since this has broken twice now, I definitely needs some tests. It looks like none of the logic that was added previously has any coverage, and neither do most of the changes here.

vincentpierre and others added 2 commits February 19, 2020 11:13
* encapsulate the agent mapping operations

* rename, linear time impl

* cleanup

* dict.popitem

* udpate comments
@vincentpierre vincentpierre merged commit ea54e4a into master Feb 24, 2020
@delete-merged-branch delete-merged-branch bot deleted the develop-fix-gym-last-reward branch February 24, 2020 20:16
anupam-142857 pushed a commit that referenced this pull request Feb 26, 2020
* Fixing #3460

* Addressing comments

* Added 2 tests

* encapsulate the agent mapping operations (#3481)

* encapsulate the agent mapping operations

* rename, linear time impl

* cleanup

* dict.popitem

* udpate comments

* Update gym-unity/gym_unity/tests/test_gym.py

Co-authored-by: Chris Elion <[email protected]>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants