Add Latent-factor multi-output kernel #143
Conversation
src/mokernels/slfm.jl
Outdated
return sum([κ.g[i](first(x), first(y)) * κ.A[last(x), i] for i in 1:length(κ.g)]) +
       κ.e[last(x)](first(x), first(y))
else
    return 0.0
This introduces a type instability. Maybe we can avoid it by computing some dummy value z of the same type as in the other branch and then returning zero(z)?
I am not sure what you mean. How do we ensure the computed dummy value is of the same type without actually executing it?
I'm not completely sure; I was still wondering what's the best way to do it. Sometimes one can perform a possibly cheaper dummy operation by, e.g., using zeros instead of actually indexing, etc. Alternatively, one could always evaluate the first branch and just call zero(res) on its result res. However, that will perform superfluous computations if last(x) != last(y).
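A minimal sketch of that second option (illustrative only; it assumes the signature proposed later in this thread, and it always pays the cost of evaluating the sum):

function (κ::LatentFactorMOKernel)((x, px)::Tuple{Vector, Int}, (y, py)::Tuple{Vector, Int})
    # Always evaluate the on-diagonal expression so both paths share one concrete type ...
    val = sum(κ.g[i](x, y) * κ.A[px, i] for i in 1:length(κ.g)) + κ.e[px](x, y)
    # ... and return zero(val) off the diagonal, keeping the method type stable.
    return px == py ? val : zero(val)
end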
Are you worried about the first branch not giving a Float64 output? When could this possibly happen? Maybe we could explicitly impose Float64 on both outputs.
function (κ::LatentFactorMOKernel)((x, px)::Tuple{Vector, Int}, (y, py)::Tuple{Vector, Int})
if px == py
return Float64(sum([κ.g[i](x, y) * κ.A[px, i] for i in 1:length(κ.g)]) +
κ.e[px](x, y))
else
return Float64(0.0)
end
end
I am not sure if this would cause any problems with AD.
Yeah, it could happen for dual numbers (or basically any other number type). The type instability occurs even if the kernels return values of type Float32 and the elements of A are of type Float32: the else branch would still return a Float64. IMO this example also shows that it's not a good idea to enforce Float64.
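A quick way to see the Float32 case (a hypothetical REPL session using kernels from KernelFunctions.jl; the exact values are unimportant):

using KernelFunctions
k = SqExponentialKernel()
A = rand(Float32, 2, 2)
x, y = rand(Float32, 3), rand(Float32, 3)
typeof(k(x, y) * A[1, 1])  # Float32: what the first branch would return
typeof(0.0)                # Float64: what the else branch returns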
@willtebbutt @theogf Any suggestions on this?
I'm not completely clear on why there's this zero branch as the SLFM produces non-zero covariance between all outputs. Am I missing something?
I am confused. Which kernel(s) from g and e would we use if px != py?
@willtebbutt Is this resolved? I am still unsure how to handle cases where px != py.
I don't know if it's worth opening another PR for it specifically, but it would be nice to add a multi-output section to the docs.
I think the docs as a whole require some fixing; https://juliagaussianprocesses.github.io/KernelFunctions.jl/stable/ doesn't seem to load properly.
Oh, that's weird. Maybe we should change the link in the "About" section then. Edit: I changed it.
Looks like this is getting there, just a few things to think about.
src/mokernels/slfm.jl
Outdated
all(gi isa Kernel for gi in g) || error("`g` should be a collection of kernels")
all(ei isa Kernel for ei in e) || error("`e` should be a collection of kernels")
Could this not be achieved by type checking? I.e., constrain g::AbstractVector{<:Kernel} or something?
Yes, we can, but @devmotion suggested we allow tuples of kernels as inputs too.
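A sketch of a constructor along those lines (the inner-constructor form and type parameters are illustrative assumptions, not the PR's actual code):

struct LatentFactorMOKernel{Tg, Te, TA <: AbstractMatrix}
    g::Tg
    e::Te
    A::TA
    function LatentFactorMOKernel(g, e, A::AbstractMatrix)
        # A runtime check accepts any iterable of kernels (both a Vector and a
        # Tuple), which a g::AbstractVector{<:Kernel} constraint would reject.
        all(gi isa Kernel for gi in g) || error("`g` should be a collection of kernels")
        all(ei isa Kernel for ei in e) || error("`e` should be a collection of kernels")
        return new{typeof(g), typeof(e), typeof(A)}(g, e, A)
    end
end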
Just one typo to fix :)
Any objections to merging this?
Yes: we still need to figure out what's going on with the off-diagonal kernel elements.
The best equation to look at is 1.7 from my doc; it lays out how to compute the covariance between two inputs that correspond to different outputs.
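For reference, the covariance the code should implement has the following form in the standard SLFM write-up (a reconstruction from the model, written with the latent kernels decomposed per factor to match the code's collection g; not a quote of the doc's equation 1.7):

\operatorname{cov}\big(f_p(x), f_{p'}(x')\big) = \sum_{q=1}^{Q} A_{pq} A_{p'q} \, k^g_q(x, x') + \delta_{pp'} \, k^e_p(x, x')

The A A^T term makes the off-diagonal (p != p') blocks generally non-zero, which is why the zero branch above is suspect.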
How many kernels k^e and k^g are there in this setup?
There's exactly one kernel k^e and one kernel k^g in this setup. The output dimension is taken care of by the modified input space (showing up as the additional p, p', and q in the equations).
So this implies both k^e and k^g would be single multi-output kernels, as they would need to handle the modified input space?
Ish. You can definitely look at them that way: k_g((x, p), (y, q)) = p == q ? kernels_g[p](x, y) : 0. But notice that in practice you'll only ever query it at p == q.
I think that this is pretty much good to go now. The only other thing that I would ask for is that the AD tests are run for at least Zygote.
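A minimal sketch of such a check (the constructor call and kernel choices are illustrative assumptions; a real test would compare the result against finite differences):

using KernelFunctions, Zygote
A = rand(2, 2)
x, y = rand(3), rand(3)
# Differentiate a single kernel evaluation with respect to the mixing matrix A.
grads = Zygote.gradient(A) do A
    k = LatentFactorMOKernel([SqExponentialKernel(), Matern32Kernel()],
                             [SqExponentialKernel(), SqExponentialKernel()], A)
    k((x, 1), (y, 1))
end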
@willtebbutt I am not sure how to move forward with this. I am facing problems while defining adjoints for …
This is actually a little tricky at the minute. I would suggest writing out the test yourself (using the same structure as the …). Edit: the same should work for …
This looks great. Once the minor suggestions have been addressed and the patch version bumped, please feel free to go ahead and merge.
This PR adds the Semi-Parametric Latent Factor kernel.