Skip to content

n and p parametrization on Zero Inflated Negative Binomial #5212

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 21, 2021

Conversation

farhanreynaldo
Copy link
Contributor

This PR added alternative parametrization (n and p) on Zero Inflated Negative Binomial based on this issue #5196.

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just did a quick skim. Looks great so far

@@ -1502,6 +1502,40 @@ def seeded_zero_inflated_negbinomial_rng_fn(self):
]


class TestZeroInflatedNegativeBinomial(BaseTestDistribution):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can be much smaller. Check other tests for alternative parametrizations in this module. Like NormalTau

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a quick glance on NormalTau, we only need to test on check_pymc_params_match_rv_op?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@codecov
Copy link

codecov bot commented Nov 20, 2021

Codecov Report

Merging #5212 (3038835) into main (a5b13d4) will increase coverage by 0.96%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5212      +/-   ##
==========================================
+ Coverage   77.98%   78.94%   +0.96%     
==========================================
  Files          88       88              
  Lines       14215    14248      +33     
==========================================
+ Hits        11085    11248     +163     
+ Misses       3130     3000     -130     
Impacted Files Coverage Δ
pymc/distributions/discrete.py 98.35% <100.00%> (+<0.01%) ⬆️
pymc/step_methods/hmc/integration.py 78.84% <0.00%> (-3.85%) ⬇️
pymc/parallel_sampling.py 86.33% <0.00%> (-1.00%) ⬇️
pymc/step_methods/hmc/nuts.py 95.00% <0.00%> (-0.63%) ⬇️
pymc/tests/sampler_fixtures.py 90.76% <0.00%> (-0.48%) ⬇️
pymc/distributions/multivariate.py 71.34% <0.00%> (-0.38%) ⬇️
pymc/gp/cov.py 98.07% <0.00%> (ø)
pymc/sampling_jax.py 0.00% <0.00%> (ø)
pymc/bart/bart.py 95.91% <0.00%> (+0.46%) ⬆️
pymc/math.py 68.68% <0.00%> (+0.50%) ⬆️
... and 3 more

@ricardoV94
Copy link
Member

Some of the tests seen to be failing.

@farhanreynaldo
Copy link
Contributor Author

Some of the tests seen to be failing.

This line is failing:

self.check_logcdf(
    ZeroInflatedNegativeBinomial,
    Nat,
    {"psi": Unit, "p": Unit, "n": NatSmall},
    lambda value, psi, p, n: np.log(
        (1 - psi) + psi * sp.nbinom.cdf(value, n, p)
    ),
)

Did I misspecify the {"psi": Unit, "p": Unit, "n": NatSmall} this one?

@ricardoV94
Copy link
Member

From the trace it seems the distribution is not respecting the invalid negative n parameter: https://github.com/pymc-devs/pymc/runs/4276276491?check_suite_focus=true#step:8:429

@ricardoV94
Copy link
Member

ricardoV94 commented Nov 21, 2021

Which is strange because it passes with the mu/alpha parametrization

@ricardoV94
Copy link
Member

ricardoV94 commented Nov 21, 2021

Seems like we have a small bug in the logcdf. We have to add a 0 < n bound condition to the logcdf here:

The case n <= 0 was not being tested independently before apparently. Perhaps because it led to a negative p (when converting from mu) which was enough to activate the bound conditions.

@farhanreynaldo
Copy link
Contributor Author

Seems like we have a small bug in the logcdf. We have to add a 0 < n bound condition to the logcdf here:

The case n<0 was not being tested independently before apparently. Perhaps because it led to a negative p (when converting from mu) which was enough to activate the bound conditions.

I am wondering, so in order to test the n and p parametrization, the logcdf bound (i.e. n<0) should be written explicitly. Why can't we rely on the implicit case where negative n would yield negative p, which activate the 0 < p bounds?

@ricardoV94
Copy link
Member

I am wondering, so in order to test the n and p parametrization, the logcdf bound (i.e. n<0) should be written explicitly. Why can't we rely on the implicit case where negative n would yield negative p, which activate the 0 < p bounds?

We are not computing p from n because the new parametrization is already given in terms of p.

@ricardoV94 ricardoV94 merged commit 988f481 into pymc-devs:main Nov 21, 2021
@ricardoV94
Copy link
Member

Thanks @farhanreynaldo!

@farhanreynaldo
Copy link
Contributor Author

I am wondering, so in order to test the n and p parametrization, the logcdf bound (i.e. n<0) should be written explicitly. Why can't we rely on the implicit case where negative n would yield negative p, which activate the 0 < p bounds?

We are not computing p from n because the new parametrization is already given in terms of p.

Ah right.

@farhanreynaldo farhanreynaldo deleted the np-parametrization branch November 21, 2021 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants