Normalizer Wrapper #12

Merged: 27 commits merged into main from normalization on May 31, 2022
Conversation

HenriDeh
Member

Hello,

I made the Normalizer that I described in #7. It is fairly straightforward to use, but I think documentation will be welcome once this package is integrated into RL.jl. I'm opening an issue as a reminder to do that in due time. Tell me what you think.

Closes #7

@HenriDeh
Member Author

In #7 I mention that this may pose a problem for NStepSampler. With this implementation, NStepSampler will not work properly.
I have two proposals to deal with that:

  • We can add a discounted_sum_of_rewards_normalizer by simply using an FTSeries wrapper to multiply rewards by (1 - gamma^n)/(1 - gamma) before fitting (see the first sketch after this list). The problem with this solution is that if a Trajectory has multiple samplers (remember we discussed the metasampler), say an NStepBatchSampler and a BatchSampler, it is now the latter that will be wrong. So this is not my favourite option.
  • What I think we should do is completely change NStepBatchSampler. To me, NStepBatchSampler should sample N consecutive experiences of all the traces it is asked for, and that's it. Computing the discounted sum of rewards should be done by the algorithm, for example in q_targets (see the second sketch after this list).
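
A minimal sketch of the first proposal, assuming OnlineStats.jl's FTSeries; the values of γ and n are illustrative, not part of any package API:

```julia
using OnlineStats

# Illustrative values; the real γ and n would come from the algorithm/sampler.
γ, n = 0.99, 5

# (1 - γ^n)/(1 - γ) = 1 + γ + … + γ^(n-1): the factor by which a constant
# reward is magnified in an n-step discounted sum.
factor = (1 - γ^n) / (1 - γ)

# Fit the running statistics on rewards scaled by that factor.
reward_stats = FTSeries(Mean(), Variance(); transform = r -> factor * r)
fit!(reward_stats, 1.0)   # each pushed reward is transformed before fitting
```

And a sketch of the second proposal, where the sampler returns the n consecutive raw rewards and the algorithm computes the return itself; the function name nstep_return is hypothetical:

```julia
# Discounted n-step return computed by the algorithm from consecutive raw rewards.
nstep_return(rewards, γ) = sum(γ^(k - 1) * r for (k, r) in enumerate(rewards))

nstep_return([1.0, 0.5, 0.25], 0.99)
```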

@findmyway
Member

Indeed, NStepBatchSampler should be changed in the next version.

@HenriDeh
Member Author

I think I reached something close to a final API. Before merging, I still need to check how this fares with ElasticArrays and Episodes. But I need to learn how they work first.

@codecov-commenter

codecov-commenter commented May 19, 2022

Codecov Report

Merging #12 (9e26766) into main (9095619) will decrease coverage by 1.10%.
The diff coverage is 60.49%.

@@            Coverage Diff             @@
##             main      #12      +/-   ##
==========================================
- Coverage   68.36%   67.25%   -1.11%     
==========================================
  Files           9       10       +1     
  Lines         373      452      +79     
==========================================
+ Hits          255      304      +49     
- Misses        118      148      +30     
Impacted Files Coverage Δ
src/rendering.jl 0.00% <0.00%> (ø)
src/traces.jl 85.23% <0.00%> (ø)
src/normalization.jl 61.53% <61.53%> (ø)
src/samplers.jl 86.66% <100.00%> (ø)
src/trajectory.jl 74.07% <0.00%> (+1.85%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9095619...9e26766.

@findmyway
Member

Great!

I think after merging this, we should register this package first, so that RLCore and RLZoo can be updated to use it.

(I've been kind of busy recently, but I should have plenty of time to work on it in the coming weeks.)

@HenriDeh
Member Author

It's not an essential part of the package anyway. If you want to use Trajectories sooner, that's fine. What I'm more concerned about are the algorithms that need the metasampler.

@HenriDeh
Member Author

So I eventually went with a NormalizedTraces (with an 's') wrapper. The reason is that otherwise we cannot use it with preconstructed traces such as CircularSARTTraces. I suspect this would have been problematic with Episodes too.
I removed fetch and settled for a simple overload of sample(::BatchSampler). Note that if we add a multistep batch sampler, we will have to implement a specific method for normalization.
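
For context, here is a conceptual sketch of what the wrapper does, assuming OnlineStats.jl for the running statistics. This is not the package's actual API; the names RunningNormalizer, fit_norm!, and normalize_obs are hypothetical:

```julia
using OnlineStats, Statistics

# Hypothetical running normalizer: fitted when experiences are pushed,
# applied when a batch is sampled (which is what the overloaded
# sample(::BatchSampler) method takes care of).
struct RunningNormalizer
    stat::Moments                # online estimate of mean and variance
end
RunningNormalizer() = RunningNormalizer(Moments())

fit_norm!(n::RunningNormalizer, x) = (fit!(n.stat, x); n)
normalize_obs(n::RunningNormalizer, x) = (x .- mean(n.stat)) ./ sqrt(var(n.stat) + 1e-8)

n = RunningNormalizer()
foreach(s -> fit_norm!(n, s), randn(1000))   # pushing experiences updates the statistics
normalize_obs(n, randn(32))                  # sampling a batch returns normalized values
```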

One question that we might want to address: currently, the two names of MultiplexTrace share the same normalizer, meaning that pushing a next_state will update the state normalizer, and sampled next_states will be normalized (as we want).
We must keep that in mind to avoid estimation errors. Currently in RL.jl, pushing a next_state is not a thing, so it should be OK.
The other failure case is pushing dummy states; this should be avoided when using normalization, since it is already a problem for non-episodic environments (as discussed in #621).
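
To illustrate the shared-normalizer point with the hypothetical RunningNormalizer sketched above (again, not the actual API):

```julia
# state and next_state are multiplexed over the same underlying trace, so they
# would share one normalizer: fitting on a pushed next_state also changes how
# subsequently sampled states are normalized.
shared = RunningNormalizer()
fit_norm!(shared, 1.0)       # pushing a state
fit_norm!(shared, 100.0)     # pushing a next_state updates the same statistics
normalize_obs(shared, 1.0)   # sampled states use the combined statistics
```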

This may be ready to merge, though I have not tested whether it works well with Episodes.

@findmyway
Member

Great!

We can merge this first and then address some other corner cases with Episodes later.

findmyway merged commit 8c6b304 into main on May 31, 2022
HenriDeh deleted the normalization branch on May 31, 2022 11:15