Sumtree sampling #60

Merged: 9 commits, Sep 13, 2023

CasBex (Contributor) commented Sep 8, 2023

For all the lofty talk about numerical rounding errors in #59, they remain unavoidable even with the improved method. This fix simply checks whether the sampled leaf has priority zero; if so, it walks backwards over the leaves until it finds one with nonzero priority. If the backward walk finds nothing, it performs a forward walk instead.

This has been tested against the JuliaRL_PrioritizedDQN_CartPole experiment in ReinforcementLearningExperiments.jl with 30 different seeds.
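
The fallback described in this comment can be illustrated with a minimal sketch. This is not the code from `src/common/sum_tree.jl`: it is written in Python rather than Julia, and the class `SumTree` and the methods `find`, `nearest_nonzero`, and `sample` are hypothetical names chosen for illustration. It assumes a plain array-backed sum tree and shows how a descent can land on a zero-priority leaf, and how a backward walk followed by a forward walk recovers a valid one.

```python
import random


class SumTree:
    """Illustrative sum tree: leaves hold priorities, internal nodes hold sums."""

    def __init__(self, priorities):
        self.n = len(priorities)
        # Node 1 is the root, nodes n..2n-1 are the leaves, index 0 is unused.
        self.tree = [0.0] * (2 * self.n)
        self.tree[self.n:] = [float(p) for p in priorities]
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def total(self):
        return self.tree[1]

    def priority(self, leaf):
        return self.tree[self.n + leaf]

    def find(self, value):
        """Descend from the root towards the leaf whose range contains value."""
        node = 1
        while node < self.n:
            left = 2 * node
            if value <= self.tree[left]:
                node = left
            else:
                value -= self.tree[left]
                node = left + 1
        return node - self.n

    def nearest_nonzero(self, leaf):
        """Return leaf if its priority is nonzero, else walk backward, then forward."""
        if self.priority(leaf) > 0:
            return leaf
        for i in range(leaf - 1, -1, -1):      # backward walk
            if self.priority(i) > 0:
                return i
        for i in range(leaf + 1, self.n):      # forward walk
            if self.priority(i) > 0:
                return i
        raise ValueError("all priorities are zero")

    def sample(self, rng=random):
        """Draw a leaf index, never returning a zero-priority leaf."""
        return self.nearest_nonzero(self.find(rng.random() * self.total()))
```

With priorities `[0, 0, 1, 2]`, `find(0.0)` descends into the all-zero left subtree, which is the kind of edge case the fix guards against; `nearest_nonzero` then finds nothing while walking backward and falls through to the forward walk.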

jeremiahpslewis (Member) commented

Looks good! Can you add a test to this?

CasBex mentioned this pull request Sep 11, 2023

CasBex (Contributor Author) commented Sep 11, 2023

Tests have been added. Feel free to change the tolerances or the number of iterations if they take too long. The first test checks that a zero priority is never sampled; the second checks that the empirical distribution of the samples matches the expected one. The latter requires many samples, so I've added some multithreading to speed it up. Both tests are run with 100 different seeds for the rng.
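
The two checks described here can be sketched roughly as follows. This is an illustration only, not the test code added in the PR: it is in Python rather than Julia, `stand_in_sampler` and `check_sampler` are hypothetical names, and `random.Random.choices` stands in for the sum-tree sampler. The sample count, tolerance, and number of seeds are assumptions picked to keep the sketch fast; it also omits the multithreading mentioned above.

```python
import random
from collections import Counter


def stand_in_sampler(priorities, rng, k):
    """Stand-in for the sum-tree sampler: k indices drawn proportionally to priority."""
    return rng.choices(range(len(priorities)), weights=priorities, k=k)


def check_sampler(priorities, n_samples=100_000, tol=0.01, seeds=range(5)):
    """Run both checks for several seeds; raises AssertionError on failure."""
    total = sum(priorities)
    for seed in seeds:
        rng = random.Random(seed)
        counts = Counter(stand_in_sampler(priorities, rng, n_samples))
        for i, p in enumerate(priorities):
            if p == 0:
                # Check 1: a zero-priority index is never sampled.
                assert counts[i] == 0, f"seed {seed}: index {i} has zero priority"
            # Check 2: empirical frequency is close to the normalized priority.
            freq = counts[i] / n_samples
            assert abs(freq - p / total) <= tol, f"seed {seed}: index {i} off by more than {tol}"
    return True
```

A call such as `check_sampler([0.0, 1.0, 0.0, 3.0])` exercises both checks at once. Tightening `tol` requires more samples per seed, which is the trade-off behind the note about tolerances and iteration counts.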

CasBex (Contributor Author) commented Sep 11, 2023

Sorry for the confusion with the tests. Should be good to go now.

jeremiahpslewis (Member) left a comment

Thanks for adding the test! One more question and a couple of minor details.

codecov bot commented Sep 12, 2023

Codecov Report

Merging #60 (3f3e99f) into main (85de617) will increase coverage by 0.32%.
The diff coverage is 53.33%.

@@            Coverage Diff             @@
##             main      #60      +/-   ##
==========================================
+ Coverage   73.21%   73.54%   +0.32%     
==========================================
  Files          15       15              
  Lines         743      756      +13     
==========================================
+ Hits          544      556      +12     
- Misses        199      200       +1     
Files Changed             Coverage Δ
src/common/sum_tree.jl    81.60% <53.33%> (+1.87%) ⬆️

findmyway (Member) left a comment

LGTM

@HenriDeh HenriDeh dismissed jeremiahpslewis’s stale review September 13, 2023 13:03

Changes were included.

HenriDeh (Member) commented

@CasBex I'll let you merge in case you want to make a last minute change.

CasBex (Contributor Author) commented Sep 13, 2023

I don't have permission to merge, @HenriDeh. Could you merge?

@jeremiahpslewis jeremiahpslewis merged commit c89ed6f into JuliaReinforcementLearning:main Sep 13, 2023