-
Notifications
You must be signed in to change notification settings - Fork 4.3k
[Bug fix] Gym last reward before Done #3471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
gym-unity/gym_unity/envs/__init__.py
Outdated
@@ -121,6 +124,7 @@ def __init__( | |||
step_result = self._env.get_step_result(self.brain_name) | |||
self._check_agents(step_result.n_agents()) | |||
self._previous_step_result = step_result | |||
self._gym_id_order = list(self._previous_step_result.agent_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way we could make _gym_id_order a dict of agent_id to index instead of a List? That way we don't have to do O(N) index operations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably don't want to do this but it might work: https://stackoverflow.com/questions/1456373/two-way-reverse-map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, you would have to go both ways : id to index and index to id. Having a list id implicitly a dict from index to id.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see so you'd have to do the two way dict to make it O(1) in both directions. If this code is only called when an agent is Done it might be OK. Could be horrific for 1000's of agents though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth trying to remove these index calls if possible. But getting tests first is more important, then you can optimize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added 2 tests for sanitize info
Since this has broken twice now, I definitely needs some tests. It looks like none of the logic that was added previously has any coverage, and neither do most of the changes here. |
* encapsulate the agent mapping operations * rename, linear time impl * cleanup * dict.popitem * udpate comments
* Fixing #3460 * Addressing comments * Added 2 tests * encapsulate the agent mapping operations (#3481) * encapsulate the agent mapping operations * rename, linear time impl * cleanup * dict.popitem * udpate comments * Update gym-unity/gym_unity/tests/test_gym.py Co-authored-by: Chris Elion <[email protected]>
Proposed change(s)
Second attempt at fixing gym.
This time we explicitly keep track of the agent_ids that gym has and their order.
When an agent is done, we replace its id in the current list with -1 and we wait for a new agent to take on the id.
Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)
#3460
MLA-651
Types of change(s)
[ ] New feature[ ] Breaking change[ ] Documentation update[ ] Other (please describe)Checklist
[ ] I have added updated the changelog (if applicable)[ ] I have added necessary documentation (if applicable)[ ] I have updated the migration guide (if applicable)Other comments