
Use tuples in KernelSum and KernelProduct; Make Zygote tests pass for KernelSum and KernelProduct; Improve docstring. #146


Merged: 19 commits, Aug 7, 2020
Changes from 15 commits
7 changes: 7 additions & 0 deletions docs/make.jl
@@ -1,6 +1,13 @@
using Documenter
using KernelFunctions

DocMeta.setdocmeta!(
KernelFunctions,
:DocTestSetup,
:(using KernelFunctions, LinearAlgebra, Random);
recursive=true,
)

makedocs(
sitename = "KernelFunctions",
format = Documenter.HTML(),
4 changes: 2 additions & 2 deletions docs/src/kernels.md
@@ -246,9 +246,9 @@ Where $\widetilde{k}$ is another kernel and $\sigma^2 > 0$.
The [`KernelSum`](@ref) is defined as a sum of kernels

```math
k(x,x';\{w_i\},\{k_i\}) = \sum_i w_i k_i(x,x'),
k(x,x';\{k_i\}) = \sum_i k_i(x,x'),
```
Where $w_i > 0$.

### KernelProduct

The [`KernelProduct`](@ref) is defined as a product of kernels
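A minimal sketch (not part of this diff) checking the two definitions above numerically; now that the weights $w_i$ have been dropped from `KernelSum`, per-component weights are expressed by scaling each kernel with `w * k` (a `ScaledKernel`):

```julia
using KernelFunctions

k1 = SqExponentialKernel()
k2 = LinearKernel()
x, y = rand(3), rand(3)

(k1 + k2)(x, y) ≈ k1(x, y) + k2(x, y)   # KernelSum
(k1 * k2)(x, y) ≈ k1(x, y) * k2(x, y)   # KernelProduct

# A weighted sum is recovered by scaling each component:
(0.5 * k1 + 2.0 * k2)(x, y) ≈ 0.5 * k1(x, y) + 2.0 * k2(x, y)
```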
4 changes: 2 additions & 2 deletions docs/src/userguide.md
@@ -71,8 +71,8 @@ For example:
```julia
k1 = SqExponentialKernel()
k2 = Matern32Kernel()
k = 0.5*k1 + 0.2*k2 # KernelSum
k = k1*k2 # KernelProduct
k = 0.5 * k1 + 0.2 * k2 # KernelSum
k = k1 * k2 # KernelProduct
```

## Kernel Parameters
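A brief sketch (not part of this diff) of what the expressions above construct:

```julia
using KernelFunctions

k1 = SqExponentialKernel()
k2 = Matern32Kernel()

0.5 * k1 isa ScaledKernel           # scaling wraps the kernel
0.5 * k1 + 0.2 * k2 isa KernelSum   # `+` builds a KernelSum
k1 * k2 isa KernelProduct           # `*` builds a KernelProduct
```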
94 changes: 74 additions & 20 deletions src/kernels/kernelproduct.jl
@@ -1,49 +1,103 @@
"""
KernelProduct(kernels::Array{Kernel})
KernelProduct <: Kernel

Create a product of kernels.
One can also use the operator `*` :
Create a product of kernels. One can also use the overloaded operator `*`.

There are various ways in which you can create a `KernelProduct`:

The simplest way to specify a `KernelProduct` is to use the overloaded `*` operator. This is
equivalent to creating a `KernelProduct` by specifying the kernels as the arguments to the constructor.
```jldoctest kernelprod
julia> k1 = SqExponentialKernel(); k2 = LinearKernel(); X = rand(5);

julia> (k = k1 * k2) == KernelProduct(k1, k2)
true

julia> kernelmatrix(k1 * k2, X) == kernelmatrix(k1, X) .* kernelmatrix(k2, X)
true

julia> kernelmatrix(k, X) == kernelmatrix(k1 * k2, X)
true
```
k1 = SqExponentialKernel()
k2 = LinearKernel()
k = KernelProduct([k1, k2]) == k1 * k2
kernelmatrix(k, X) == kernelmatrix(k1, X) .* kernelmatrix(k2, X)
kernelmatrix(k, X) == kernelmatrix(k1 * k2, X)

You can also specify a `KernelProduct` by providing a `Tuple` or a `Vector` of the
kernels to be multiplied. We suggest using a `Tuple` when you have few components
(the kernel types stay concrete, which is type-stable) and a `Vector` when dealing
with a large number of components.
```jldoctest kernelprod
julia> KernelProduct((k1, k2)) == k1 * k2
true

julia> KernelProduct([k1, k2]) == KernelProduct((k1, k2)) == k1 * k2
true
```
"""
struct KernelProduct <: Kernel
kernels::Vector{Kernel}
struct KernelProduct{Tk} <: Kernel
kernels::Tk
end

function KernelProduct(kernel::Kernel, kernels::Kernel...)
return KernelProduct((kernel, kernels...))
end

Base.:*(k1::Kernel,k2::Kernel) = KernelProduct([k1,k2])
Base.:*(k1::KernelProduct,k2::KernelProduct) = KernelProduct(vcat(k1.kernels,k2.kernels)) #TODO Add test
Base.:*(k::Kernel,kp::KernelProduct) = KernelProduct(vcat(k,kp.kernels))
Base.:*(kp::KernelProduct,k::Kernel) = KernelProduct(vcat(kp.kernels,k))
Base.:*(k1::Kernel, k2::Kernel) = KernelProduct(k1, k2)

function Base.:*(
k1::KernelProduct{<:AbstractVector{<:Kernel}},
k2::KernelProduct{<:AbstractVector{<:Kernel}},
)
return KernelProduct(vcat(k1.kernels, k2.kernels))
end

function Base.:*(k1::KernelProduct, k2::KernelProduct)
return KernelProduct(k1.kernels..., k2.kernels...) #TODO Add test
end

function Base.:*(k::Kernel, ks::KernelProduct{<:AbstractVector{<:Kernel}})
return KernelProduct(vcat(k, ks.kernels))
end

Base.:*(k::Kernel, kp::KernelProduct) = KernelProduct(k, kp.kernels...)

function Base.:*(ks::KernelProduct{<:AbstractVector{<:Kernel}}, k::Kernel)
return KernelProduct(vcat(ks.kernels, k))
end

Base.:*(kp::KernelProduct, k::Kernel) = KernelProduct(kp.kernels..., k)

Base.length(k::KernelProduct) = length(k.kernels)

(κ::KernelProduct)(x, y) = prod(k(x, y) for k in κ.kernels)

function kernelmatrix(κ::KernelProduct, x::AbstractVector)
return reduce(hadamard, kernelmatrix(κ.kernels[i], x) for i in 1:length(κ))
return reduce(hadamard, kernelmatrix(k, x) for k in κ.kernels)
end

function kernelmatrix(κ::KernelProduct, x::AbstractVector, y::AbstractVector)
return reduce(hadamard, kernelmatrix(κ.kernels[i], x, y) for i in 1:length(κ))
return reduce(hadamard, kernelmatrix(k, x, y) for k in κ.kernels)
end

function kerneldiagmatrix(κ::KernelProduct, x::AbstractVector)
return reduce(hadamard, kerneldiagmatrix(κ.kernels[i], x) for i in 1:length(κ))
return reduce(hadamard, kerneldiagmatrix(k, x) for k in κ.kernels)
end

function Base.show(io::IO, κ::KernelProduct)
printshifted(io, κ, 0)
end

function Base.:(==)(x::KernelProduct, y::KernelProduct)
return (
length(x.kernels) == length(y.kernels) &&
all(kx == ky for (kx, ky) in zip(x.kernels, y.kernels))
)
end

function printshifted(io::IO, κ::KernelProduct, shift::Int)
print(io, "Product of $(length(κ)) kernels:")
for i in 1:length(κ)
print(io, "\n" * ("\t" ^ (shift + 1))* "- ")
printshifted(io, κ.kernels[i], shift + 2)
for k in κ.kernels
print(io, "\n" )
for _ in 1:(shift + 1)
print(io, "\t")
end
printshifted(io, k, shift + 2)
end
end
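A sketch (not part of this diff) of how the container type propagates through the `*` methods above; `kt` and `kv` are illustrative names:

```julia
using KernelFunctions

k1, k2, k3 = SqExponentialKernel(), LinearKernel(), RationalQuadraticKernel()

kt = KernelProduct((k1, k2))    # Tuple-backed: concrete component types
kv = KernelProduct([k1, k2])    # Vector-backed: convenient for many components

(kt * k3).kernels isa Tuple     # the varargs method splats, keeping a Tuple
(kv * k3).kernels isa Vector    # the vcat method keeps a Vector
kt == kv                        # the new `==` compares components elementwise
```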
114 changes: 71 additions & 43 deletions src/kernels/kernelsum.jl
@@ -1,73 +1,101 @@
"""
KernelSum(kernels::Array{Kernel}; weights::Array{Real}=ones(length(kernels)))
KernelSum <: Kernel

Create a positive weighted sum of kernels. All weights should be positive.
One can also use the operator `+`
Create a sum of kernels. One can also use the operator `+`.

There are various ways in which you can create a `KernelSum`:

The simplest way to specify a `KernelSum` is to use the overloaded `+` operator. This is
equivalent to creating a `KernelSum` by specifying the kernels as the arguments to the constructor.
```jldoctest kernelsum
julia> k1 = SqExponentialKernel(); k2 = LinearKernel(); X = rand(5);

julia> (k = k1 + k2) == KernelSum(k1, k2)
true

julia> kernelmatrix(k1 + k2, X) == kernelmatrix(k1, X) .+ kernelmatrix(k2, X)
true

julia> kernelmatrix(k, X) == kernelmatrix(k1 + k2, X)
true
```
k1 = SqExponentialKernel()
k2 = LinearKernel()
k = KernelSum([k1, k2]) == k1 + k2
kernelmatrix(k, X) == kernelmatrix(k1, X) .+ kernelmatrix(k2, X)
kernelmatrix(k, X) == kernelmatrix(k1 + k2, X)
kweighted = 0.5* k1 + 2.0*k2 == KernelSum([k1, k2], weights = [0.5, 2.0])

You can also specify a `KernelSum` by providing a `Tuple` or a `Vector` of the
kernels to be summed. We suggest using a `Tuple` when you have few components
(the kernel types stay concrete, which is type-stable) and a `Vector` when dealing
with a large number of components.
```jldoctest kernelsum
julia> KernelSum((k1, k2)) == k1 + k2
true

julia> KernelSum([k1, k2]) == KernelSum((k1, k2)) == k1 + k2
true
```
"""
struct KernelSum <: Kernel
kernels::Vector{Kernel}
weights::Vector{Real}
struct KernelSum{Tk} <: Kernel
kernels::Tk
end

function KernelSum(kernel::Kernel, kernels::Kernel...)
return KernelSum((kernel, kernels...))
end

function KernelSum(
kernels::AbstractVector{<:Kernel};
weights::AbstractVector{<:Real} = ones(Float64, length(kernels)),
)
@assert length(kernels) == length(weights) "Weights and kernel vector should be of the same length"
@assert all(weights .>= 0) "All weights should be positive"
return KernelSum(kernels, weights)
Base.:+(k1::Kernel, k2::Kernel) = KernelSum(k1, k2)

function Base.:+(
k1::KernelSum{<:AbstractVector{<:Kernel}},
k2::KernelSum{<:AbstractVector{<:Kernel}},
)
return KernelSum(vcat(k1.kernels, k2.kernels))
end

Base.:+(k1::Kernel, k2::Kernel) = KernelSum([k1, k2], weights = [1.0, 1.0])
Base.:+(k1::ScaledKernel, k2::ScaledKernel) = KernelSum([kernel(k1), kernel(k2)], weights = [first(k1.σ²), first(k2.σ²)])
Base.:+(k1::KernelSum, k2::KernelSum) =
KernelSum(vcat(k1.kernels, k2.kernels), weights = vcat(k1.weights, k2.weights))
Base.:+(k::Kernel, ks::KernelSum) =
KernelSum(vcat(k, ks.kernels), weights = vcat(1.0, ks.weights))
Base.:+(k::ScaledKernel, ks::KernelSum) =
KernelSum(vcat(kernel(k), ks.kernels), weights = vcat(first(k.σ²), ks.weights))
Base.:+(k::ScaledKernel, ks::Kernel) =
KernelSum(vcat(kernel(k), ks), weights = vcat(first(k.σ²), 1.0))
Base.:+(ks::KernelSum, k::Kernel) =
KernelSum(vcat(ks.kernels, k), weights = vcat(ks.weights, 1.0))
Base.:+(ks::KernelSum, k::ScaledKernel) =
KernelSum(vcat(ks.kernels, kernel(k)), weights = vcat(ks.weights, first(k.σ²)))
Base.:+(ks::Kernel, k::ScaledKernel) =
KernelSum(vcat(ks, kernel(k)), weights = vcat(1.0, first(k.σ²)))
Base.:*(w::Real, k::KernelSum) = KernelSum(k.kernels, weights = w * k.weights) #TODO add tests
Base.:+(k1::KernelSum, k2::KernelSum) = KernelSum(k1.kernels..., k2.kernels...)

function Base.:+(k::Kernel, ks::KernelSum{<:AbstractVector{<:Kernel}})
return KernelSum(vcat(k, ks.kernels))
end

Base.:+(k::Kernel, ks::KernelSum) = KernelSum(k, ks.kernels...)

function Base.:+(ks::KernelSum{<:AbstractVector{<:Kernel}}, k::Kernel)
return KernelSum(vcat(ks.kernels, k))
end

Base.:+(ks::KernelSum, k::Kernel) = KernelSum(ks.kernels..., k)

Base.length(k::KernelSum) = length(k.kernels)

(κ::KernelSum)(x, y) = sum(κ.weights[i] * κ.kernels[i](x, y) for i in 1:length(κ))
(κ::KernelSum)(x, y) = sum(k(x, y) for k in κ.kernels)

function kernelmatrix(κ::KernelSum, x::AbstractVector)
return sum(κ.weights[i] * kernelmatrix(κ.kernels[i], x) for i in 1:length(κ))
return sum(kernelmatrix(k, x) for k in κ.kernels)
end

function kernelmatrix(κ::KernelSum, x::AbstractVector, y::AbstractVector)
return sum(κ.weights[i] * kernelmatrix(κ.kernels[i], x, y) for i in 1:length(κ))
return sum(kernelmatrix(k, x, y) for k in κ.kernels)
end

function kerneldiagmatrix(κ::KernelSum, x::AbstractVector)
return sum(κ.weights[i] * kerneldiagmatrix(κ.kernels[i], x) for i in 1:length(κ))
return sum(kerneldiagmatrix(k, x) for k in κ.kernels)
end

function Base.show(io::IO, κ::KernelSum)
printshifted(io, κ, 0)
end

function Base.:(==)(x::KernelSum, y::KernelSum)
return (
length(x.kernels) == length(y.kernels) &&
all(kx == ky for (kx, ky) in zip(x.kernels, y.kernels))
)
end

function printshifted(io::IO,κ::KernelSum, shift::Int)
print(io,"Sum of $(length(κ)) kernels:")
for i in 1:length(κ)
print(io, "\n" * ("\t" ^ (shift + 1)) * "- (w = $(κ.weights[i])) ")
printshifted(io, κ.kernels[i], shift + 2)
for k in κ.kernels
print(io, "\n" )
for _ in 1:(shift + 1)
print(io, "\t")
end
printshifted(io, k, shift + 2)
end
end
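With the `weights` field removed, the old keyword constructor has no direct counterpart; a hedged sketch (not part of this diff) of the migration, assuming `w * k` produces a `ScaledKernel`:

```julia
using KernelFunctions

k1 = SqExponentialKernel()
k2 = LinearKernel()

# Old (removed): KernelSum([k1, k2]; weights = [0.5, 2.0])
k = 0.5 * k1 + 2.0 * k2

k isa KernelSum                               # true
all(ki isa ScaledKernel for ki in k.kernels)  # true: the weights live in the components

x, y = rand(3), rand(3)
k(x, y) ≈ 0.5 * k1(x, y) + 2.0 * k2(x, y)     # true
```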
7 changes: 7 additions & 0 deletions src/kernels/tensorproduct.jl
@@ -101,6 +101,13 @@ end

Base.show(io::IO, kernel::TensorProduct) = printshifted(io, kernel, 0)

function Base.:(==)(x::TensorProduct, y::TensorProduct)
return (
length(x.kernels) == length(y.kernels) &&
all(kx == ky for (kx, ky) in zip(x.kernels, y.kernels))
)
end

function printshifted(io::IO, kernel::TensorProduct, shift::Int)
print(io, "Tensor product of ", length(kernel), " kernels:")
for k in kernel.kernels
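A small sketch (not part of this diff) of the `==` method added above, assuming the usual `TensorProduct` constructors:

```julia
using KernelFunctions

k1, k2 = SqExponentialKernel(), LinearKernel()

TensorProduct(k1, k2) == TensorProduct([k1, k2])  # true: components compared elementwise
TensorProduct(k1, k2) == TensorProduct(k2, k1)    # false: order matters
```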
2 changes: 1 addition & 1 deletion src/trainable.jl
@@ -28,7 +28,7 @@ trainable(k::GaborKernel) = (k.kernel,)

trainable(κ::KernelProduct) = κ.kernels

trainable(κ::KernelSum) = (κ.weights, κ.kernels) #To check
trainable(κ::KernelSum) = κ.kernels #To check

trainable(κ::ScaledKernel) = (κ.σ², κ.kernel)

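A hedged sketch (not part of this diff) of the effect on parameter collection, assuming Flux's `params` walks these `trainable` definitions:

```julia
using KernelFunctions, Flux

k = 2.0 * SqExponentialKernel() + LinearKernel(c = 0.5)

# With no `weights` field on KernelSum, parameters come only from the
# component kernels: the scale sits in a ScaledKernel, `c` in the LinearKernel.
Flux.params(k)
```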
34 changes: 26 additions & 8 deletions test/kernels/kernelproduct.jl
@@ -1,19 +1,38 @@
@testset "kernelproduct" begin
rng = MersenneTwister(123456)
x = rand(rng)*2
v1 = rand(rng, 3)
v2 = rand(rng, 3)

k1 = LinearKernel()
k2 = SqExponentialKernel()
k3 = RationalQuadraticKernel()
X = rand(rng, 2,2)

k = KernelProduct([k1, k2])
k = KernelProduct(k1,k2)
ks1 = 2.0*k1
ks2 = 0.5*k2
@test length(k) == 2
@test string(k) == (
"Product of 2 kernels:\n\tLinear Kernel (c = 0.0)\n\tSquared " *
"Exponential Kernel"
)
@test k(v1, v2) == (k1 * k2)(v1, v2)
@test (k * k)(v1, v2) ≈ k(v1, v2)^2
@test (k * k3)(v1, v2) ≈ (k3 * k)(v1, v2)
@test (k * k3)(v1,v2) ≈ (k3 * k)(v1, v2)
@test (k1 * k2)(v1, v2) == KernelProduct(k1, k2)(v1, v2)
@test (k * ks1)(v1, v2) ≈ (ks1 * k)(v1, v2)
@test (k * k)(v1, v2) == KernelProduct([k1, k2, k1, k2])(v1, v2)
@test KernelProduct([k1, k2]) == KernelProduct((k1, k2)) == k1 * k2

@testset "kernelmatrix" begin
@test (KernelProduct([k1, k2]) * KernelProduct([k2, k1])).kernels == [k1, k2, k2, k1]
@test (KernelProduct([k1, k2]) * k3).kernels == [k1, k2, k3]
@test (k3 * KernelProduct([k1, k2])).kernels == [k3, k1, k2]

@test (KernelProduct((k1, k2)) * KernelProduct((k2, k1))).kernels == (k1, k2, k2, k1)
@test (KernelProduct((k1, k2)) * k3).kernels == (k1, k2, k3)
@test (k3 * KernelProduct((k1, k2))).kernels == (k3, k1, k2)

@testset "kernelmatrix" begin
rng = MersenneTwister(123456)

Nx = 5
@@ -22,8 +41,8 @@

w1 = rand(rng) + 1e-3
w2 = rand(rng) + 1e-3
k1 = SqExponentialKernel()
k2 = LinearKernel()
k1 = w1 * SqExponentialKernel()
k2 = w2 * LinearKernel()
k = k1 * k2

@testset "$(typeof(x))" for (x, y) in [
@@ -47,6 +66,5 @@
@test kerneldiagmatrix!(tmp_diag, k, x) ≈ kerneldiagmatrix(k, x)
end
end
test_ADs(x->SqExponentialKernel() * LinearKernel(c= x[1]), rand(1), ADs = [:ForwardDiff, :ReverseDiff])
@test_broken "Zygote issue"
test_ADs(x->SqExponentialKernel() * LinearKernel(c= x[1]), rand(1), ADs = [:ForwardDiff, :ReverseDiff, :Zygote])
end
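A sketch (not part of this diff) of the Zygote path these tests now exercise; `loss` is an illustrative name:

```julia
using KernelFunctions, Zygote

X = rand(5)
loss(c) = sum(kernelmatrix(SqExponentialKernel() * LinearKernel(c = c[1]), X))

# Previously marked broken; with tuple-backed kernels this should differentiate.
Zygote.gradient(loss, [0.5])
```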