
Commit e3fd56b

lucianopaz authored and junpenglao committed
Mixture random cleanup (#3364)
- WIP mixture
- Still WIP but much better performance on posterior predictive sampling. Still need to work around size issues.
- Fixed mixture random method (and also multinomial when using a Cholesky covariance matrix).
- Fixed float32 allclose relative tolerance error.
- Changed the way in which we wrap the comp_dists.random method to something less hackish. At least we don't ignore lint anymore.
- Moved comp_dists type checking to __init__. Changed _comp_logp, mean and median to stack on the last axis (which should allow Mixture to work with multidimensional distributions). Changed docstring a bit. Minor change in _comp_samples to enforce dtype.
- Improved support for Mixtures of multidimensional distributions.
- Added tests for mixtures of multidimensional distributions.
- Added kwarg to generate_samples to help pass raw_size_ tuples.
- Cleaned up infer_comp_dist_shapes and added docstring.
- Fixed latent component assignment for mixtures of multidimensional distributions.
- Fixed mixture of mixtures errors. The error was in broadcast_distribution_samples.
- Patched broadcast_distribution_samples. Must do a cleaner fix.
- Fixed broadcast_distribution_samples bugs. Added release notes.
- Added tests comparing mixtures to latent component models.
- Added more examples to Mixture docstring.
1 parent 5b3046b · commit e3fd56b

File tree

7 files changed: +684, -164 lines changed

RELEASE-NOTES.md

Lines changed: 9 additions & 0 deletions

```diff
@@ -4,6 +4,8 @@

 ### New features

+- `Mixture` now supports mixtures of multidimensional probability distributions, not just lists of 1D distributions.
+
 ### Maintenance

 - All occurrences of `sd` as a parameter name have been renamed to `sigma`. `sd` will continue to function for backwards compatibility.
@@ -13,6 +15,13 @@
 - Added a fix to allow the imputation of single missing values of observed data, which previously would fail (Fix issue #3122).
 - Fix for #3346. The `draw_values` function was too permissive with what could be grabbed from inside `point`, which led to an error when sampling posterior predictives of variables that depended on shared variables that had changed their shape after `pm.sample()` had been called.
 - Fix for #3354. `draw_values` now adds the theano graph descendants of `TensorConstant` or `SharedVariable` to the named relationship nodes stack, only if these descendants are `ObservedRV` or `MultiObservedRV` instances.
+- Fixed bug in `broadcast_distribution_samples`, which did not correctly handle cases in which some samples did not have the size tuple prepended.
+- Changed `MvNormal.random`'s usage of `tensordot` for Cholesky-encoded covariances. The old usage led to wrong axis broadcasting and seemed to be the cause of issue #3343.
+- Fixed defect in `Mixture.random` when multidimensional mixtures were involved. The mixture component was not preserved across all the elements of the dimensions of the mixture, so the correlations across elements within a given draw of the mixture were partly broken.
+- Restructured `Mixture.random` to allow better use of vectorized calls to `comp_dists.random`.
+- Added tests for mixtures of multidimensional distributions to the test suite.
+- Fixed incorrect usage of `broadcast_distribution_samples` in `DiscreteWeibull`.
+- `Mixture`'s default dtype is now determined by `theano.config.floatX`.

 ### Deprecations

```

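The `Mixture.random` release note above can be illustrated with a numpy-only sketch (illustrative names, not pymc3 API): the latent component index must be drawn once per sample and reused for every dimension of that draw, otherwise the within-draw correlation structure the note describes is destroyed.

```python
import numpy as np

rng = np.random.default_rng(42)
w = np.array([0.3, 0.7])              # mixture weights
mus = np.array([[-5.0, -5.0, -5.0],   # component 0 mean (3 dimensions)
                [5.0, 5.0, 5.0]])     # component 1 mean

size = 1000
comp = rng.choice(2, size=size, p=w)  # one latent component per draw
draws = rng.normal(mus[comp], 0.5)    # shape (size, 3)

# All three coordinates of a given draw come from the same component, so
# they share the sign of that component's mean (means are +/-5, sd 0.5).
signs = np.sign(draws)
assert np.all(signs == signs[:, [0]])
```

Sampling the component independently per dimension would instead mix coordinates from different components within a single draw, which is exactly the defect the fix removes.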
pymc3/distributions/discrete.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -347,12 +347,12 @@ def _ppf(self, p):

     def _random(self, q, beta, size=None):
         p = np.random.uniform(size=size)
-        p, q, beta = broadcast_distribution_samples([p, q, beta], size=size)

         return np.ceil(np.power(np.log(1 - p) / np.log(q), 1. / beta)) - 1

     def random(self, point=None, size=None):
         q, beta = draw_values([self.q, self.beta], point=point, size=size)
+        q, beta = broadcast_distribution_samples([q, beta], size=size)

         return generate_samples(self._random, q, beta,
                                 dist_shape=self.shape,
```
pymc3/distributions/distribution.py

Lines changed: 91 additions & 32 deletions

```diff
@@ -254,7 +254,7 @@ class _DrawValuesContextBlocker(_DrawValuesContext, metaclass=InitContextMeta):
     """
     def __new__(cls, *args, **kwargs):
         # resolves the parent instance
-        instance = super(_DrawValuesContextBlocker, cls).__new__(cls)
+        instance = super().__new__(cls)
         instance._parent = None
         return instance

```
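The hunk above swaps the explicit Python 2 style `super(Class, cls)` for the zero-argument Python 3 form, which resolves the MRO without repeating the class name. A minimal standalone demonstration of the same pattern:

```python
class Base:
    def __new__(cls, *args, **kwargs):
        # Zero-argument super() is equivalent to super(Base, cls) here,
        # but survives class renames unchanged.
        instance = super().__new__(cls)
        instance._parent = None
        return instance

class Child(Base):
    pass

c = Child()
assert isinstance(c, Child) and c._parent is None
```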

```diff
@@ -599,13 +599,19 @@ def generate_samples(generator, *args, **kwargs):
         parameters. This may be required when the parameter shape
         does not determine the shape of a single sample, for example,
         the shape of the probabilities in the Categorical distribution.
+    not_broadcast_kwargs: dict or None
+        Keyword argument dictionary to provide to the random generator, which
+        must not be broadcast with the rest of the *args and **kwargs.

     Any remaining *args and **kwargs are passed on to the generator function.
     """
     dist_shape = kwargs.pop('dist_shape', ())
     one_d = _is_one_d(dist_shape)
     size = kwargs.pop('size', None)
     broadcast_shape = kwargs.pop('broadcast_shape', None)
+    not_broadcast_kwargs = kwargs.pop('not_broadcast_kwargs', None)
+    if not_broadcast_kwargs is None:
+        not_broadcast_kwargs = dict()
     if size is None:
         size = 1

```

```diff
@@ -625,6 +631,8 @@ def generate_samples(generator, *args, **kwargs):
         kwargs = {k: v.reshape(v.shape + (1,) * (max_dims - v.ndim)) for k, v in kwargs.items()}
         inputs = args + tuple(kwargs.values())
         broadcast_shape = np.broadcast(*inputs).shape  # size of generator(size=1)
+        # Update kwargs with the keyword arguments that were not broadcast
+        kwargs.update(not_broadcast_kwargs)

     dist_shape = to_tuple(dist_shape)
     broadcast_shape = to_tuple(broadcast_shape)
```
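The point of `not_broadcast_kwargs` in the two hunks above is that some generator arguments (e.g. a Categorical probability vector) must reach the generator untouched rather than be broadcast against the other parameters. A toy sketch of the pattern — `draw` and its signature are hypothetical, not pymc3 API:

```python
import numpy as np

def draw(generator, *args, size=None, not_broadcast_kwargs=None):
    """Broadcast positional parameters to find the output shape, while
    entries in not_broadcast_kwargs bypass broadcasting entirely."""
    if not_broadcast_kwargs is None:
        not_broadcast_kwargs = {}
    broadcast_shape = np.broadcast(*args).shape if args else ()
    size_tup = () if size is None else (size,)
    # Merge the protected kwargs only after broadcasting, mirroring the
    # kwargs.update(not_broadcast_kwargs) call in the hunk above.
    return generator(*args, size=size_tup + broadcast_shape,
                     **not_broadcast_kwargs)

# p is a probability vector of shape (3,); broadcasting it against size
# would corrupt it, so it rides through not_broadcast_kwargs instead.
rng = np.random.default_rng(0)
gen = lambda size, p: rng.choice(len(p), size=size, p=p)
out = draw(gen, size=5, not_broadcast_kwargs={'p': np.array([0.2, 0.3, 0.5])})
assert out.shape == (5,)
```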
```diff
@@ -639,20 +647,30 @@ def generate_samples(generator, *args, **kwargs):
         samples = generator(size=broadcast_shape, *args, **kwargs)
     elif dist_shape == broadcast_shape:
         samples = generator(size=size_tup + dist_shape, *args, **kwargs)
-    elif len(dist_shape) == 0 and size_tup and broadcast_shape[:len(size_tup)] == size_tup:
-        # Input's dist_shape is scalar, but it has size repetitions.
-        # So now the size matches but we have to manually broadcast to
-        # the right dist_shape
-        samples = [generator(*args, **kwargs)]
-        if samples[0].shape == broadcast_shape:
-            samples = samples[0]
+    elif len(dist_shape) == 0 and size_tup and broadcast_shape:
+        # There is no dist_shape (scalar distribution) but the parameters
+        # broadcast shape and size_tup determine the size to provide to
+        # the generator
+        if broadcast_shape[:len(size_tup)] == size_tup:
+            # Input's dist_shape is scalar, but it has size repetitions.
+            # So now the size matches but we have to manually broadcast to
+            # the right dist_shape
+            samples = [generator(*args, **kwargs)]
+            if samples[0].shape == broadcast_shape:
+                samples = samples[0]
+            else:
+                suffix = broadcast_shape[len(size_tup):] + dist_shape
+                samples.extend([generator(*args, **kwargs).
+                                reshape(broadcast_shape)[..., np.newaxis]
+                                for _ in range(np.prod(suffix,
+                                                       dtype=int) - 1)])
+                samples = np.hstack(samples).reshape(size_tup + suffix)
         else:
-            suffix = broadcast_shape[len(size_tup):] + dist_shape
-            samples.extend([generator(*args, **kwargs).
-                            reshape(broadcast_shape)[..., np.newaxis]
-                            for _ in range(np.prod(suffix,
-                                                   dtype=int) - 1)])
-            samples = np.hstack(samples).reshape(size_tup + suffix)
+            # The parameter shape is given, but we have to concatenate it
+            # with the size tuple
+            samples = generator(size=size_tup + broadcast_shape,
+                                *args,
+                                **kwargs)
     else:
         samples = None
     # Args have been broadcast correctly, can just ask for the right shape out
```
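The new `else` branch in the hunk above handles a scalar distribution whose parameters broadcast to a shape that does not start with the size tuple: the generator is simply asked for `size_tup + broadcast_shape` in one vectorized call. A plain-numpy sketch of that shape arithmetic (variable names mirror the diff but the setup is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.zeros(4)                  # parameters of a scalar distribution
size_tup = (100,)                 # number of requested draws
broadcast_shape = np.broadcast(mu, 1.0).shape   # (4,)

# broadcast_shape does not start with size_tup, so the fallback branch
# concatenates the two and draws everything at once.
assert broadcast_shape[:len(size_tup)] != size_tup
samples = rng.normal(mu, 1.0, size=size_tup + broadcast_shape)
assert samples.shape == (100, 4)
```

This replaces the old behavior of falling through to the repeat-and-`hstack` path, which needed one generator call per missing element.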
```diff
@@ -686,27 +704,68 @@ def generate_samples(generator, *args, **kwargs):


 def broadcast_distribution_samples(samples, size=None):
+    """Broadcast samples drawn from distributions, taking into account the
+    size (i.e. the number of samples) of the draw, which is prepended to
+    the samples' shapes.
+
+    Parameters
+    ----------
+    samples: Iterable of ndarrays holding the sampled values
+    size: None, int or tuple (optional)
+        size of the sample set requested.
+
+    Returns
+    -------
+    List of broadcasted sample arrays
+
+    Examples
+    --------
+    .. code-block:: python
+
+        size = 100
+        sample0 = np.random.randn(size)
+        sample1 = np.random.randn(size, 5)
+        sample2 = np.random.randn(size, 4, 5)
+        out = broadcast_distribution_samples([sample0, sample1, sample2],
+                                             size=size)
+        assert all((o.shape == (size, 4, 5) for o in out))
+        assert np.all(sample0[:, None, None] == out[0])
+        assert np.all(sample1[:, None, :] == out[1])
+        assert np.all(sample2 == out[2])
+
+    .. code-block:: python
+
+        size = 100
+        sample0 = np.random.randn(size)
+        sample1 = np.random.randn(5)
+        sample2 = np.random.randn(4, 5)
+        out = broadcast_distribution_samples([sample0, sample1, sample2],
+                                             size=size)
+        assert all((o.shape == (size, 4, 5) for o in out))
+        assert np.all(sample0[:, None, None] == out[0])
+        assert np.all(sample1 == out[1])
+        assert np.all(sample2 == out[2])
+    """
     if size is None:
         return np.broadcast_arrays(*samples)
     _size = to_tuple(size)
-    try:
-        broadcasted_samples = np.broadcast_arrays(*samples)
-    except ValueError:
-        # Raw samples shapes
-        p_shapes = [p.shape for p in samples]
-        # samples shapes without the size prepend
-        sp_shapes = [s[len(_size):] if _size == s[:len(_size)] else s
-                     for s in p_shapes]
-        broadcast_shape = np.broadcast(*[np.empty(s) for s in sp_shapes]).shape
-        broadcasted_samples = []
-        for param, p_shape, sp_shape in zip(samples, p_shapes, sp_shapes):
-            if _size == p_shape[:len(_size)]:
-                slicer_head = [slice(None)] * len(_size)
-            else:
-                slicer_head = [np.newaxis] * len(_size)
+    # Raw samples shapes
+    p_shapes = [p.shape for p in samples]
+    # samples shapes without the size prepend
+    sp_shapes = [s[len(_size):] if _size == s[:len(_size)] else s
+                 for s in p_shapes]
+    broadcast_shape = np.broadcast(*[np.empty(s) for s in sp_shapes]).shape
+    broadcasted_samples = []
+    for param, p_shape, sp_shape in zip(samples, p_shapes, sp_shapes):
+        if _size == p_shape[:len(_size)]:
+            # If size prepends the shape, then we have to add broadcasting
+            # axes in the middle
+            slicer_head = [slice(None)] * len(_size)
             slicer_tail = ([np.newaxis] * (len(broadcast_shape) -
                                            len(sp_shape)) +
                            [slice(None)] * len(sp_shape))
-            broadcasted_samples.append(param[tuple(slicer_head + slicer_tail)])
-        broadcasted_samples = np.broadcast_arrays(*broadcasted_samples)
-    return broadcasted_samples
+        else:
+            # If size does not prepend the shape, then we have to leave the
+            # parameter as is
+            slicer_head = []
+            slicer_tail = [slice(None)] * len(sp_shape)
+        broadcasted_samples.append(param[tuple(slicer_head + slicer_tail)])
+    return np.broadcast_arrays(*broadcasted_samples)
```

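The new docstring examples above are effectively the function's contract. Here is a self-contained transcription of the rewritten function (with a minimal stand-in for pymc3's `to_tuple` helper, so it runs outside pymc3) exercising the second example end to end:

```python
import numpy as np

def to_tuple(size):
    """Minimal stand-in for pymc3's to_tuple helper."""
    if size is None:
        return tuple()
    if np.isscalar(size):
        return (size,)
    return tuple(size)

def broadcast_distribution_samples(samples, size=None):
    if size is None:
        return np.broadcast_arrays(*samples)
    _size = to_tuple(size)
    p_shapes = [p.shape for p in samples]
    # Strip the size prepend, when present, to find each sample's own shape
    sp_shapes = [s[len(_size):] if _size == s[:len(_size)] else s
                 for s in p_shapes]
    broadcast_shape = np.broadcast(*[np.empty(s) for s in sp_shapes]).shape
    broadcasted_samples = []
    for param, p_shape, sp_shape in zip(samples, p_shapes, sp_shapes):
        if _size == p_shape[:len(_size)]:
            # Size-prepended: keep the size axes, insert broadcast axes after
            slicer_head = [slice(None)] * len(_size)
            slicer_tail = ([np.newaxis] * (len(broadcast_shape) - len(sp_shape))
                           + [slice(None)] * len(sp_shape))
        else:
            # No size prepend: leave the parameter as is
            slicer_head = []
            slicer_tail = [slice(None)] * len(sp_shape)
        broadcasted_samples.append(param[tuple(slicer_head + slicer_tail)])
    return np.broadcast_arrays(*broadcasted_samples)

# Second docstring example: only sample0 carries the size prepend
size = 100
sample0 = np.random.randn(size)
sample1 = np.random.randn(5)
sample2 = np.random.randn(4, 5)
out = broadcast_distribution_samples([sample0, sample1, sample2], size=size)
assert all(o.shape == (size, 4, 5) for o in out)
assert np.all(sample0[:, None, None] == out[0])
assert np.all(sample1 == out[1])
assert np.all(sample2 == out[2])
```

Unlike the old version, which only entered this path after `np.broadcast_arrays` raised a `ValueError`, the rewrite always reasons about the size prepend explicitly, which is what fixes the mixed prepended/unprepended cases mentioned in the release notes.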