TridiagonalConjugation #194

Merged
merged 18 commits into from
Feb 8, 2025
Conversation

dlfivefifty
Member

@DanielVandH I started looking at this, but BidiagonalConjugation seems wrong to me: it computes inv(U)XV, but I think we actually need U*X*inv(V).

For example, consider differentiating Chebyshev U. We have the banded conversions T == U*R0 and U == C*R1, and the known derivative diff(T) == U*D0. Thus we have

diff(U) == diff(T)*inv(R0) == U*D0*inv(R0) == C*R1*D0*inv(R0)

so we want to represent R1*D0*inv(R0), where R1 and R0 are banded.

Here is a code example:

julia> R0 = U\T
ℵ₀×ℵ₀ BandedMatrix{Float64} with bandwidths (0, 2) with data vcat(…) with indices OneToInf()×OneToInf():
 1.0  0.0  -0.5    ⋅     ⋅     ⋅     ⋅   …
  ⋅   0.5   0.0  -0.5    ⋅     ⋅     ⋅
  ⋅    ⋅    0.5   0.0  -0.5    ⋅     ⋅
  ⋅    ⋅     ⋅    0.5   0.0  -0.5    ⋅
  ⋅    ⋅     ⋅     ⋅    0.5   0.0  -0.5
  ⋅    ⋅     ⋅     ⋅     ⋅    0.5   0.0
  ⋅    ⋅     ⋅     ⋅     ⋅     ⋅    0.5
  ⋮                                  ⋱

julia> R1 = C\U
ℵ₀×ℵ₀ BandedMatrix{Float64} with bandwidths (0, 2) with data vcat(…) with indices OneToInf()×OneToInf():
 1.0  0.0  -0.333333     ⋅      ⋅    …
  ⋅   0.5   0.0       -0.25     ⋅
  ⋅    ⋅    0.333333   0.0    -0.2
  ⋅    ⋅     ⋅         0.25    0.0
  ⋅    ⋅     ⋅          ⋅      0.2
  ⋮                            ⋱

julia> D0 = U\diff(T)
ℵ₀×ℵ₀ BandedMatrix{Float64} with bandwidths (-1, 1) with data 1×ℵ₀ adjoint(::InfiniteArrays.InfStepRange{Float64, Float64}) with indices OneToInf()×OneToInf():
  ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅   …
  ⋅    ⋅   2.0   ⋅    ⋅    ⋅    ⋅
  ⋅    ⋅    ⋅   3.0   ⋅    ⋅    ⋅
  ⋅    ⋅    ⋅    ⋅   4.0   ⋅    ⋅
  ⋅    ⋅    ⋅    ⋅    ⋅   5.0   ⋅
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   6.0
  ⋮                             ⋱

julia> R1*D0*inv(R0)
(ℵ₀×ℵ₀ BandedMatrix{Float64} with bandwidths (0, 2) with data vcat(…)) * (ℵ₀×ℵ₀ BandedMatrix{Float64} with bandwidths (-1, 1) with data …) * (inv(ℵ₀×ℵ₀ BandedMatrix{Float64} with bandwidths (0, 2) with data vcat(…))) with indices OneToInf()×OneToInf():
  ⋅   2.0  0.0  0.0  0.0  0.0  0.0  …
  ⋅    ⋅   2.0  0.0  0.0  0.0  0.0
  ⋅    ⋅    ⋅   2.0  0.0  0.0  0.0
  ⋅    ⋅    ⋅    ⋅   2.0  0.0  0.0
  ⋅    ⋅    ⋅    ⋅    ⋅   2.0  0.0
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   2.0
  ⋮                             ⋱

julia> C\diff(U)
ℵ₀×ℵ₀ BandedMatrix{Float64} with bandwidths (-1, 1) with data 1×ℵ₀ FillArrays.Fill{Float64, 2, …} with indices OneToInf()×OneToInf():
  ⋅   2.0   ⋅    ⋅    ⋅    ⋅    ⋅   …
  ⋅    ⋅   2.0   ⋅    ⋅    ⋅    ⋅
  ⋅    ⋅    ⋅   2.0   ⋅    ⋅    ⋅
  ⋅    ⋅    ⋅    ⋅   2.0   ⋅    ⋅
  ⋅    ⋅    ⋅    ⋅    ⋅   2.0   ⋅
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   2.0
  ⋮                             ⋱

Do you agree? Or is there a use case for inv(U)XV?
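As a quick sanity check without the infinite-array machinery, one can truncate R0, R1, and D0 to finite matrices (using the entries visible in the REPL output above: R0 has diagonal [1, 1/2, 1/2, …] and second superdiagonal -1/2; R1 has diagonal 1/k and second superdiagonal -1/(k+2); D0 has first superdiagonal 1, 2, 3, …) and confirm that R1*D0*inv(R0) has a superdiagonal of 2s, matching C\diff(U). A minimal sketch using only LinearAlgebra:

```julia
using LinearAlgebra

# Finite n×n truncations of the banded matrices printed above.
n = 10
R0 = zeros(n, n); R1 = zeros(n, n); D0 = zeros(n, n)
for k in 1:n
    R0[k, k] = k == 1 ? 1.0 : 0.5
    R1[k, k] = 1 / k
    k + 1 ≤ n && (D0[k, k+1] = k)
    k + 2 ≤ n && (R0[k, k+2] = -0.5; R1[k, k+2] = -1 / (k + 2))
end

M = R1 * D0 / R0                  # = R1*D0*inv(R0)
M ≈ diagm(1 => fill(2.0, n - 1))  # bidiagonal with superdiagonal 2s, as in C\diff(U)
```

Since everything here is upper triangular and banded, the truncated product agrees with the leading block of the infinite one, so the check is exact up to floating point.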


codecov bot commented Jan 29, 2025

Codecov Report

Attention: Patch coverage is 92.70833% with 14 lines in your changes missing coverage. Please review.

Project coverage is 87.05%. Comparing base (24f9718) to head (332dde8).
Report is 4 commits behind head on master.

Files with missing lines              Patch %   Lines
src/banded/tridiagonalconjugation.jl  93.82%    11 Missing ⚠️
src/infqr.jl                          72.72%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #194      +/-   ##
==========================================
+ Coverage   85.26%   87.05%   +1.78%     
==========================================
  Files          13       11       -2     
  Lines        1595     1367     -228     
==========================================
- Hits         1360     1190     -170     
+ Misses        235      177      -58     


@DanielVandH
Contributor

The inv(U)XV case was needed for differentiating negative-parameter bases: I introduced it in JuliaApproximation/SemiclassicalOrthogonalPolynomials.jl#111 but removed it in JuliaApproximation/SemiclassicalOrthogonalPolynomials.jl#122, since I didn't realise at the time that there was a better way to do weighted derivatives.

If there's always this better way to handle weighted derivatives (i.e. differentiating $(1-x)P^{t, (a, 1, c)}$ directly instead of $P^{t, (a, 0, c)}L_{\mathrm b, (a, 1, c)}^{t, (a, 0, c)}$, for example), then we probably don't need the inv(U)XV case.

@dlfivefifty
Member Author

OK that makes sense. It is possible to recover the other version as follows:

julia> BidiagonalConjugation(R0', D0', R1', :L)'
ℵ₀×ℵ₀ Bidiagonal{Float64, InfiniteLinearAlgebra.BidiagonalConjugationBand{Float64}} with indices OneToInf()×OneToInf():
 0.0  2.0   ⋅    ⋅    ⋅    ⋅   …
  ⋅   0.0  2.0   ⋅    ⋅    ⋅
  ⋅    ⋅   0.0  2.0   ⋅    ⋅
  ⋅    ⋅    ⋅   0.0  2.0   ⋅
  ⋅    ⋅    ⋅    ⋅   0.0  2.0
  ⋮                        ⋱

Though I wonder if the default should be swapped.
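The recovery works because transposition swaps the two forms: (inv(V')*X'*U')' == U*X*inv(V). A small dense sketch of that identity with generic well-conditioned triangular factors (illustrative matrices, not the package's lazy types):

```julia
using LinearAlgebra

# Transpose trick: applying an inv(·)-on-the-left conjugation to the
# transposed factors and transposing back yields the U*X*inv(V) form.
U = UpperTriangular(rand(5, 5) + 5I)   # well-conditioned upper triangular
V = UpperTriangular(rand(5, 5) + 5I)
X = rand(5, 5)

(V' \ (X' * U'))' ≈ U * X / V   # true
```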

@dlfivefifty
Member Author

The code is very slow though; I'll have to debug this:

julia> n = 100_000; @time layout_getindex(B,1:n,1:n);
  0.961080 seconds (16.18 M allocations: 521.940 MiB, 6.31% gc time)

julia> n = 100_000; @time layout_getindex(B,1:n,1:n);
  0.001608 seconds (6 allocations: 1.526 MiB)

@dlfivefifty
Member Author

Oh wait, that is O(n), so it's not super bad. But I suspect it can be made 10x faster by avoiding getindex.

@dlfivefifty
Member Author

dlfivefifty commented Jan 29, 2025

Actually it was just slow because my R0 was constructed poorly.

EDIT: oh whoops, it is still slow with a faster R0.

@DanielVandH
Contributor

If UXinv(V) is typically the more useful form, then yeah, I could probably switch the default. It won't break anything, since nothing's using it currently.

But I suspect it can be made 10x faster by avoiding getindex

Where would we avoid getindex?

@dlfivefifty
Member Author

The problem with e.g. U[k,k+1] is that getindex for lazy arrays is slow, since it involves branching. This can be avoided by doing it all at once: U[band(0)][1:n].

@dlfivefifty
Member Author

Here's an example of the speed difference:

julia> @time [(R0)[k,k] for k=1:n];
  0.048581 seconds (331.83 k allocations: 6.930 MiB, 17.34% gc time, 72.31% compilation time)

julia> @time Vector(R0[band(0)][1:n]);
  0.000080 seconds (25 allocations: 782.062 KiB)

@DanielVandH
Contributor

DanielVandH commented Jan 29, 2025

Ah, I didn't think about the branching; I thought maybe branch prediction would help. My benchmarking shows around a 4x speedup with band. So, e.g., in

function __compute_columns!(data::BidiagonalConjugationData, U, C, i)
    ds = data.datasize
    up = data.uplo == 'U'
    for j in (ds+1):i
        up ? _compute_column_up!(data, U, C, j) : _compute_column_lo!(data, U, C, j)
    end
    data.datasize = i
    return data
end

it could be better to not loop over each entry individually, using

function _compute_column_up!(data::BidiagonalConjugationData, U, C, i)
    dv, ev = data.dv, data.ev
    if i == 1
        dv[i] = C[1, 1] / U[1, 1]
    else
        uᵢ₋₁ᵢ₋₁, uᵢᵢ₋₁, uᵢ₋₁ᵢ, uᵢᵢ = U[i-1, i-1], U[i, i-1], U[i-1, i], U[i, i]
        cᵢ₋₁ᵢ, cᵢᵢ = C[i-1, i], C[i, i]
        Uᵢ⁻¹ = inv(uᵢ₋₁ᵢ₋₁ * uᵢᵢ - uᵢ₋₁ᵢ * uᵢᵢ₋₁)
        dv[i] = Uᵢ⁻¹ * (uᵢ₋₁ᵢ₋₁ * cᵢᵢ - uᵢᵢ₋₁ * cᵢ₋₁ᵢ)
        ev[i-1] = Uᵢ⁻¹ * (uᵢᵢ * cᵢ₋₁ᵢ - uᵢ₋₁ᵢ * cᵢᵢ)
    end
    return data
end
function _compute_column_lo!(data::BidiagonalConjugationData, U, C, i)
    dv, ev = data.dv, data.ev
    uᵢᵢ, uᵢ₊₁ᵢ, uᵢᵢ₊₁, uᵢ₊₁ᵢ₊₁ = U[i, i], U[i+1, i], U[i, i+1], U[i+1, i+1]
    cᵢᵢ, cᵢ₊₁ᵢ = C[i, i], C[i+1, i]
    Uᵢ⁻¹ = inv(uᵢᵢ * uᵢ₊₁ᵢ₊₁ - uᵢᵢ₊₁ * uᵢ₊₁ᵢ)
    dv[i] = Uᵢ⁻¹ * (uᵢ₊₁ᵢ₊₁ * cᵢᵢ - uᵢᵢ₊₁ * cᵢ₊₁ᵢ)
    ev[i] = Uᵢ⁻¹ * (uᵢᵢ * cᵢ₊₁ᵢ - uᵢ₊₁ᵢ * cᵢᵢ)
    return data
end
and instead first get all the entries from U and C, and then loop over the allocated vectors?
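That batched idea can be sketched on finite matrices as follows. This is illustrative, not the package implementation: the function name is hypothetical, and the bands are pulled out with `diag` from LinearAlgebra (for the lazy infinite matrices one would use `U[band(k)][1:n]` as above), after which dv/ev are filled by a loop over plain Vectors with no per-entry getindex on the lazy operands:

```julia
using LinearAlgebra

# Hypothetical batched variant of _compute_column_up!: extract each needed
# band of U and C once, then solve the 2×2 system per column.
function compute_columns_up!(dv, ev, U, C, n)
    u0, u1, um1 = diag(U, 0), diag(U, 1), diag(U, -1)  # bands of U
    c0, c1 = diag(C, 0), diag(C, 1)                    # bands of C
    dv[1] = c0[1] / u0[1]
    for i in 2:n
        d = u0[i-1] * u0[i] - u1[i-1] * um1[i-1]       # 2×2 determinant
        dv[i]   = (u0[i-1] * c0[i]   - um1[i-1] * c1[i-1]) / d
        ev[i-1] = (u0[i]   * c1[i-1] - u1[i-1]  * c0[i])   / d
    end
    return dv, ev
end

# Quick check: if C == U*B for an upper-bidiagonal B, we recover B's bands.
n = 8
U = Tridiagonal(rand(n - 1), rand(n) .+ 2, rand(n - 1))
B = Bidiagonal(rand(n) .+ 1, rand(n - 1), :U)
C = Matrix(U) * Matrix(B)
dv, ev = zeros(n), zeros(n - 1)
compute_columns_up!(dv, ev, U, C, n)
dv ≈ B.dv && ev ≈ B.ev   # true
```

The per-column formulas are the same Cramer's-rule solves as in `_compute_column_up!`; only the access pattern changes.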

@dlfivefifty
Member Author

Let me implement my own tridiagonal variant in a way that is fast and we can see how to revise the bidiagonal one accordingly.

@dlfivefifty dlfivefifty marked this pull request as ready for review February 8, 2025 10:21
@dlfivefifty dlfivefifty merged commit 84517d2 into master Feb 8, 2025
10 of 12 checks passed
@dlfivefifty dlfivefifty deleted the TriConjugate branch February 8, 2025 10:56