Skip to content

Add Cosine Kernel #45

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 24, 2020
Merged

Add Cosine Kernel #45

merged 9 commits into from
Mar 24, 2020

Conversation

sharanry
Copy link
Contributor

@sharanry sharanry commented Mar 15, 2020

Add Cosine kernel as described in GPML.

GPML also provides derivatives of the kernel w.r.t its hyper-parameters. Holding off on that based on the suggestion of @willtebbutt.

Related Issue #44

TODO

  • Add tests

@willtebbutt
Copy link
Member

This is great, thanks for submitting!

I don't think the scale factor is necessary, since we can handle it in a generic way with the ScaledKernel.

I wonder also whether we could handle the length of the period via the ScaleTransform, and therefore don't even need that parameter?


The cosine kernel is a stationary kernel for a sinusoidal given by
```
κ(x,y) = cos( 2π * (x-y) )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since CosineKernel uses the Cityblock distance, I guess it should rather be

Suggested change
κ(x,y) = cos( 2π * (x-y) )
κ(x, y) = cos(2π * x-y‖₁)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually is it really the city block metric? I would expect to see the euclidean one. Actually is there a reference for this kernel? That would be nice to also have it in the documentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been able to find a proper reference for the Cosine kernel. The best I could find was GPML's official manual pg44.

Cityblock metric is something I inferred from here.

GPML's dosctring says (x-y). Please note that cosine is an even function.

Copy link
Member

@willtebbutt willtebbutt Mar 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, actually, does the Cosine kernel make sense in > 1D? i.e. if you use the CityBlock metric, are the covariance matrices it produces positive definite?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GMPL only defines the cosine kernel for one-dimensional data (also stated in the manual), and hence clearly x-y and |x-y| yield the same values due to the symmetry of cosine. I agree with @theogf that probably the most natural extension to multiple dimensions would be the Euclidean distance but unfortunately I don't have any reference at hand.

Copy link
Member

@willtebbutt willtebbutt Mar 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just following up on my point above, the following suggests it doesn't make sense in higher dimensions:

using Stheno
using Stheno: Euclidean
import Stheno: pw

pw(k::Cosine, x::ColVecs) = cos.(pi .* pw(Euclidean(), x.X) ./ k.p)

x = ColVecs(randn(3, 10))
eigvals(pw(Cosine(1.0), x))
10-element Array{Float64,1}:
 -2.6807114471460594
 -1.3962293964372774
 -0.963215712851525
 -0.40797823094040286
  0.9649993076661302
  1.6352424163325816
  1.8341656854612294
  2.543664324231879
  3.937344284513905
  4.53271876916954

Obviously you'll get different numbers than me because seeds, but looks like it doesn't make sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I looked very quickly, the cosine kernel support is only on [0,1] in 1D, which could be expanded to the unit sphere for higher dimensions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, weird. Does it not extend to the entire real line by virtue of the periodicity of cos?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, do you suggest we extend it to multi-dim sphere?

Copy link
Member

@willtebbutt willtebbutt Mar 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that we have the right abstractions to handle this at the minute. I would suggest sticking with the real line so that we can continue to push forwards quickly. Maybe make a note of it on the overall issue?

@theogf
Copy link
Member

theogf commented Mar 16, 2020

I found a good reference : http://www.gaussianprocess.org/gpml/code/matlab/doc/manual.pdf
p 44

@sharanry
Copy link
Contributor Author

@willtebbutt @theogf @devmotion Do we stick to CityBlock or use Euclidean? The documentation isn't very clear about it. http://www.gaussianprocess.org/gpml/code/matlab/doc/manual.pdf pg44

@willtebbutt
Copy link
Member

Euclidean is probably better, because I know that we have the adjoints implemented somewhere (possibly just in Stheno at the minute)

@theogf
Copy link
Member

theogf commented Mar 20, 2020 via email

"""
struct CosineKernel <: BaseKernel end

kappa(κ::CosineKernel, d::Real) = cos(2*pi*d)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just browsing through the Julia documentation and noticed that there is a function cospi that computes cos(pi*x) more accurately than the "naive" implementation, particularly for large x. I guess we could use it here?

Suggested change
kappa::CosineKernel, d::Real) = cos(2*pi*d)
kappa::CosineKernel, d::Real) = cospi(2*d)

I'm also wondering if we should keep the scaling with 2 here - I don't know what's commonly used since the documentation of GPML (no scaling) and its implementation (scaling) are not consistent in that point and GPyTorch does not scale by a factor of 2 (according to the docs). I guess that decision could be based on the default domain that we want to support?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also wondering more generally about the validity of this kernel. cf this question : https://www.quora.com/Is-cosine-kernel-as-given-in-Kernel-statistics-a-radial-basis-function-kernel
I think it is indeed only valid when whatever is inside cos is determined between -pi/2 and pi/2. Is it really psd if inputs are outside of this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also noted here : https://en.wikipedia.org/wiki/Kernel_(statistics)#Kernel_functions_in_common_use
Apparently it should return 0 outside

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I just realised that these kernels are more meant for convolution stuff. Nevermind

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it help to stay consistent with GPyTorch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, every package has its own convention... I like pi/2 because then for a distance of 1 the kernel outputs 0

@sharanry sharanry mentioned this pull request Mar 20, 2020
@sharanry
Copy link
Contributor Author

Looks like we will need to merge this before we get to Gabor kernel #52.
@willtebbutt I have made note in the original issue #44 about extending cosine to hypersphere.

@theogf @devmotion
Do you suggest any other changes or can we merge this?

Copy link
Member

@willtebbutt willtebbutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Great work!

@devmotion
Copy link
Member

Looks good to me, even though I'm not sure if we've actually decided on a scaling in the discussion above? Otherwise, just update the docstring to reflect whatever we end up with, and, as mentioned above, I'd suggest using cospi(...) instead of cos(pi * ...).

@sharanry
Copy link
Contributor Author

Regarding scaling, I support going with κ(x,y) = cos( π * (x-y) ) to keep consistency with the documentation of GPML and GPPytorch. If this is a problem for any user, they can always rescale. Any objections?

@willtebbutt
Copy link
Member

I'm happy to merge once tests have passed. Can always re-visit at a later date if we find a particularly good reason to adopt a different convention.

@sharanry
Copy link
Contributor Author

@willtebbutt Looks like the tests have passed. :)

@willtebbutt willtebbutt merged commit 0654f2f into JuliaGaussianProcesses:master Mar 24, 2020
@sharanry
Copy link
Contributor Author

Thanks!

@sharanry sharanry deleted the cosine branch March 24, 2020 10:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants