-
Notifications
You must be signed in to change notification settings - Fork 36
Add Cosine Kernel #45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is great, thanks for submitting! I don't think the scale factor is necessary, since we can handle it in a generic way with the I wonder also whether we could handle the length of the period via the |
Co-Authored-By: willtebbutt <[email protected]>
Co-Authored-By: David Widmann <[email protected]>
src/kernels/cosine.jl
Outdated
|
||
The cosine kernel is a stationary kernel for a sinusoidal given by | ||
``` | ||
κ(x,y) = cos( 2π * (x-y) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since CosineKernel
uses the Cityblock
distance, I guess it should rather be
κ(x,y) = cos( 2π * (x-y) ) | |
κ(x, y) = cos(2π * ‖x-y‖₁) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually is it really the city block metric? I would expect to see the euclidean one. Actually is there a reference for this kernel? That would be nice to also have it in the documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't been able to find a proper reference for the Cosine kernel. The best I could find was GPML's official manual pg44.
Cityblock
metric is something I inferred from here.
GPML's dosctring says (x-y)
. Please note that cosine is an even function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, actually, does the Cosine
kernel make sense in > 1D? i.e. if you use the CityBlock
metric, are the covariance matrices it produces positive definite?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GMPL only defines the cosine kernel for one-dimensional data (also stated in the manual), and hence clearly x-y
and |x-y|
yield the same values due to the symmetry of cosine. I agree with @theogf that probably the most natural extension to multiple dimensions would be the Euclidean distance but unfortunately I don't have any reference at hand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just following up on my point above, the following suggests it doesn't make sense in higher dimensions:
using Stheno
using Stheno: Euclidean
import Stheno: pw
pw(k::Cosine, x::ColVecs) = cos.(pi .* pw(Euclidean(), x.X) ./ k.p)
x = ColVecs(randn(3, 10))
eigvals(pw(Cosine(1.0), x))
10-element Array{Float64,1}:
-2.6807114471460594
-1.3962293964372774
-0.963215712851525
-0.40797823094040286
0.9649993076661302
1.6352424163325816
1.8341656854612294
2.543664324231879
3.937344284513905
4.53271876916954
Obviously you'll get different numbers than me because seeds, but looks like it doesn't make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I looked very quickly, the cosine kernel support is only on [0,1] in 1D, which could be expanded to the unit sphere for higher dimensions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, weird. Does it not extend to the entire real line by virtue of the periodicity of cos
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, do you suggest we extend it to multi-dim sphere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know that we have the right abstractions to handle this at the minute. I would suggest sticking with the real line so that we can continue to push forwards quickly. Maybe make a note of it on the overall issue?
I found a good reference : http://www.gaussianprocess.org/gpml/code/matlab/doc/manual.pdf |
@willtebbutt @theogf @devmotion Do we stick to |
|
The adjoints have been added in Zygote by @devmotion now!
Le ven. 20 mars 2020 à 17:35, willtebbutt <[email protected]> a
écrit :
… Euclidean is probably better, because I know that we have the adjoints
implemented somewhere (possibly just in Stheno at the minute)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<https://github.com/theogf/KernelFunctions.jl/pull/45#issuecomment-601795160>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACXSYAYRDB2DER7XH5LZQATRIOLL5ANCNFSM4LLK7OHQ>
.
--
Théo Galy-Fajou
PhD Student/Research Assistant TU Berlin
https://theogf.github.io/
|
src/kernels/cosine.jl
Outdated
""" | ||
struct CosineKernel <: BaseKernel end | ||
|
||
kappa(κ::CosineKernel, d::Real) = cos(2*pi*d) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was just browsing through the Julia documentation and noticed that there is a function cospi
that computes cos(pi*x)
more accurately than the "naive" implementation, particularly for large x
. I guess we could use it here?
kappa(κ::CosineKernel, d::Real) = cos(2*pi*d) | |
kappa(κ::CosineKernel, d::Real) = cospi(2*d) |
I'm also wondering if we should keep the scaling with 2 here - I don't know what's commonly used since the documentation of GPML (no scaling) and its implementation (scaling) are not consistent in that point and GPyTorch does not scale by a factor of 2 (according to the docs). I guess that decision could be based on the default domain that we want to support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also wondering more generally about the validity of this kernel. cf this question : https://www.quora.com/Is-cosine-kernel-as-given-in-Kernel-statistics-a-radial-basis-function-kernel
I think it is indeed only valid when whatever is inside cos
is determined between -pi/2 and pi/2. Is it really psd if inputs are outside of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also noted here : https://en.wikipedia.org/wiki/Kernel_(statistics)#Kernel_functions_in_common_use
Apparently it should return 0 outside
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I just realised that these kernels are more meant for convolution stuff. Nevermind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it help to stay consistent with GPyTorch
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, every package has its own convention... I like pi/2 because then for a distance of 1 the kernel outputs 0
Looks like we will need to merge this before we get to Gabor kernel #52. @theogf @devmotion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Great work!
Looks good to me, even though I'm not sure if we've actually decided on a scaling in the discussion above? Otherwise, just update the docstring to reflect whatever we end up with, and, as mentioned above, I'd suggest using |
Regarding scaling, I support going with |
I'm happy to merge once tests have passed. Can always re-visit at a later date if we find a particularly good reason to adopt a different convention. |
@willtebbutt Looks like the tests have passed. :) |
Thanks! |
Add Cosine kernel as described in GPML.
GPML also provides derivatives of the kernel w.r.t its hyper-parameters. Holding off on that based on the suggestion of @willtebbutt.
Related Issue #44
TODO