Add Spectral Mixture Kernel #80
Having a few issues coding up the SM kernel's kappa. The dimensions don't seem to add up. Trying to find answers in the references.
Also, do you think it is better to have Spectral Mixture and Spectral Mixture Product as two separate kernels? I didn't really like GPML's setup.
@sharanry GPML is a good base, and it's interesting to enlarge our collection with the many examples they have. But we don't have to copy exactly what they are doing. If you feel another structure is more appropriate, you should go for it.
I don't believe that this is the best way to implement the spectral mixture. @sharanry is there a reason not to do this by simply writing a function that accepts the parameters of a spectral mixture and spits out a kernel built by combining the existing building blocks?
@willtebbutt I am trying to do that. The problem in general with using pre-existing kernels is why I thought it would be cleaner to implement from scratch. Do you suggest I use pre-existing kernels?
Okay, maybe I've misunderstood something. Considering first just the 1-dimensional SM kernel, is there any reason that we can't implement that as sum(ws .* SqExponential(sigmas) .* Cosine(mus)) with appropriate re-scalings to match the parametrisations in the paper?
Edit: there are also things like [1], so it would be really nice to be able to parametrise a larger space of spectral mixture kernels than just the ones involving Exponentiated Quadratic kernels (see eqn 6).
[1] Samo, Yves-Laurent Kom, and Stephen Roberts. "Generalized spectral kernels." arXiv preprint arXiv:1506.02236 (2015).
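The 1-dimensional construction suggested here can be sketched in plain Julia. This is a hedged illustration, not the package's API: sm_kernel_1d, w, mu, and v are hypothetical names for the weights, spectral means, and spectral variances of the Wilson & Adams (2013) parametrisation.

```julia
# Standalone sketch (not the package API) of the 1-D spectral mixture kernel:
# a weighted sum of SqExponential * Cosine terms in the stationary offset tau.
# tau = x - z; w, mu, v are hypothetical names for the Q components' weights,
# spectral means (frequencies), and spectral variances.
function sm_kernel_1d(tau, w, mu, v)
    return sum(w .* exp.(-2 * pi^2 * tau^2 .* v) .* cos.(2 * pi * tau .* mu))
end

# At tau = 0 every component contributes its full weight, so the result is sum(w):
sm_kernel_1d(0.0, [0.5, 0.5], [1.0, 2.0], [0.1, 0.2])  # == 1.0
```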
@willtebbutt I think something like that could work. Edit: but applying the two kernels' kappas and taking their product (a product of two reals) would give something quite different? One possible way I can think of addressing this issue while still using the existing building blocks ...
Yeah, so for the separable multi-dimensional kernel, we need some kind of prod(map(d -> k[d](x[d], y[d]), 1:D)). Once you've got this, constructing a spectral mixture kernel that is separable over dimensions, such as the one in the GPatt paper, is straightforward. It's pretty common to want separable kernels, so we should definitely have this abstraction in the package.
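The prod(map(...)) pattern above can be made concrete with a small standalone sketch. Names here (tensor_product_kernel, k1, k2) are illustrative only, not the package's API.

```julia
# Sketch of the separable (tensor product) construction: apply a 1-D kernel
# per input dimension and take the product over dimensions.
# `kernels` is a vector of functions k_d(x, y); names are illustrative only.
function tensor_product_kernel(kernels, x, y)
    return prod(map(d -> kernels[d](x[d], y[d]), eachindex(kernels)))
end

k1(a, b) = exp(-abs2(a - b))      # toy squared-exponential for dimension 1
k2(a, b) = cos(2 * pi * (a - b))  # toy cosine for dimension 2

tensor_product_kernel([k1, k2], [0.0, 0.0], [0.0, 0.5])  # == 1 * cos(pi) == -1.0
```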
I think so too. I mentioned something like this just now in the edit of my earlier comment.
@willtebbutt Do you suggest I open a separate PR for this?
Yeah. I reckon open a PR that does the separable stuff, then come back to this one.
I would be happy to make a PR to add such a kernel; what should it be called?
I would suggest ...
Isn't that a general tensor product kernel? 🤔 See #56 for a discussion of a version with just two kernels.
Yes, yes it is. It would be great if that PR could be extended to work nicely in D dimensions. I'm thinking along the same lines as the discussions around sum/product kernels, where you allow for sums/products over arbitrarily many kernels, but let the user specify the storage to ensure efficiency over a wide range of components in the sum/product/tensor product.
I think it should now be possible to implement this as a function, without introducing a new type. @sharanry are you happy with how that would look?
@willtebbutt I am currently trying to implement this using that approach. However, ...
Regarding the spectral mixture kernel, I would suggest taking a look at equation 6 of [1] (we don't need to implement precisely the parametrisation originally suggested) -- it's a strict generalisation of equation 5 in the same paper, which is AGW's spectral mixture kernel for multi-dimensional inputs. We should be able to implement it just as a sum of Gabor kernels, or more generally something like sum(alphas .* StretchTransform.(SqExponential(), gammas) .* Cosine.(omegas)) (this definitely isn't exactly correct, but hopefully the gist is clear). Then we could just write a function that takes in the parameters ...
Certainly it won't be the case that we can use the tensor product kernel for anything other than the tensor product version of the single-dimensional spectral mixture, as it's quite a restrictive form to impose on a multi-dimensional kernel.
[1] Samo, Yves-Laurent Kom, and Stephen Roberts. "Generalized spectral kernels." arXiv preprint arXiv:1506.02236 (2015).
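The "sum of Gabor kernels" idea can be sketched directly on the stationary offset. This is a hedged illustration in the spirit of eqn (6) of Samo & Roberts (2015), not the package's implementation; alphas, gammas, and omegas are assumed names for weights, per-dimension inverse length scales, and per-dimension frequencies.

```julia
using LinearAlgebra: dot

# Hedged sketch of a multi-dimensional spectral mixture as a sum of Gabor
# kernels. tau = x - z (length D); alphas has Q entries; gammas[q] and
# omegas[q] are length-D vectors. All names are assumptions, not the package API.
function sm_kernel(tau, alphas, gammas, omegas)
    return sum(alphas[q] *
               exp(-sum(abs2, gammas[q] .* tau)) *   # SqExponential-type envelope
               cos(2 * pi * dot(omegas[q], tau))     # Cosine-type oscillation
               for q in eachindex(alphas))
end

# tau = 0 recovers sum(alphas), since each Gabor term evaluates to 1:
sm_kernel(zeros(2), [0.3, 0.7], [ones(2), ones(2)], [ones(2), 2 .* ones(2)])  # == 1.0
```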
@willtebbutt Thanks! I will go through the paper and get back to you.
@willtebbutt I had the chance to go through the paper you mentioned and its parametrisation. To confirm what you meant, the idea is to create a transform called ...
The problem I am facing right now is that ... Edit: ...
You're completely right @sharanry, good point. It is what we want for the input to ...
Nice changes :)
@willtebbutt @devmotion Do you suggest any other changes, or can this be merged?
Currently the parameters for spectral_mixture_kernel and spectral_mixture_product_kernel are represented using row-major storage. This probably isn't optimal since Julia uses column-major, so it's the wrong access pattern really. I think you should change the parameter matrices to be of size D x K, rather than K x D.
This should simplify the implementations a bit. For example, you should be able to relax the AbstractMatrix type constraint to AbstractVecOrMat for the spectral_mixture_kernel inputs, so that there's no need to reshape stuff in spectral_mixture_product_kernel.
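The reviewer's storage point can be illustrated in a few lines. Julia stores matrices column-major, so a D x K layout (one column per mixture component) makes each component a contiguous slice. The sizes below are arbitrary examples, not values from the PR.

```julia
# Illustration of the column-major argument: with parameters stored as D x K
# (D input dimensions, K mixture components), each component is a contiguous
# column, so no reshaping or strided row access is needed.
D, K = 3, 5
params = rand(D, K)                       # one column per mixture component

components = collect(eachcol(params))     # cheap contiguous views
length(components)                        # == K
length(components[1])                     # == D
```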
@willtebbutt Currently, I believe ... Edit: ...
You're totally right about this, my mistake. The other alternative here would be the vector-of-vectors approach, as we've taken with the inputs, but I think the current implementation is probably fine. I'm happy for this to be merged now :) Nice work. Edit: @sharanry could you add a default for the ...
LGTM! Will merge once tests pass, if you're happy @sharanry.
LGTM too. :)
Issue #44
Gaussian Spectral Mixture kernel function. The kernel function parametrization depends on the sign of Q.
Let t (Dx1) be an offset vector in data space, e.g. t = x - z. Then w (DxP) are the weights, and m (Dx|Q|) = 1/p, v (Dx|Q|) = (2*pi*ell)^-2 are spectral means (frequencies) and variances, where p is the period and ell the lengthscale of the Gabor function h(t2v, tm), given by the expression

    h(t2v, tm) = exp(-2*pi^2 * t2v) * cos(2*pi * tm)

Then, the two covariances are obtained as follows:
SM, spectral mixture: Q>0 => P = 1
SMP, spectral mixture product: Q<0 => P = D
Note that for D=1, the two modes +Q and -Q are exactly the same.
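The two modes can be sketched in plain Julia. Shapes here are hedged assumptions following the GPML-style parametrisation (for the SM mode w is taken as a Q-vector; for SMP it is a D x Q matrix), not necessarily this PR's final signatures.

```julia
# Gabor building block: h(t2v, tm) = exp(-2*pi^2*t2v) * cos(2*pi*tm), elementwise.
h(t2v, tm) = exp.(-2 * pi^2 .* t2v) .* cos.(2 * pi .* tm)

# SM (Q > 0): weighted sum over Q components. t is the length-D offset x - z,
# m and v are D x Q, w is a Q-vector. Shapes are assumptions, not the PR's API.
k_sm(t, w, m, v) = sum(w .* vec(h((t .^ 2)' * v, t' * m)))

# SMP (Q < 0): product over the D dimensions of per-dimension mixtures; w is D x Q.
k_smp(t, w, m, v) = prod(sum(w .* h(v .* (t .^ 2), m .* t); dims=2))

# At t = 0 both reduce to simple weight aggregates (sum, resp. product of row sums):
k_sm(zeros(2), [0.25, 0.75], ones(2, 2), ones(2, 2))  # == 1.0
```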
References:
[1] SM: Gaussian Process Kernels for Pattern Discovery and Extrapolation, ICML, 2013, by Andrew Gordon Wilson and Ryan Prescott Adams.
[2] SMP: GPatt: Fast Multidimensional Pattern Extrapolation with GPs, arXiv:1310.5288, 2013, by Andrew Gordon Wilson, Elad Gilboa, Arye Nehorai and John P. Cunningham.
[3] Covariance kernels for fast automatic pattern discovery and extrapolation with Gaussian processes, Andrew Gordon Wilson, PhD Thesis, January 2014. http://www.cs.cmu.edu/~andrewgw/andrewgwthesis.pdf
[4] http://www.cs.cmu.edu/~andrewgw/pattern/