Skip to content

Speed up Cholesky Jacobi matrices #169

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Feb 7, 2024
Merged

Conversation

TSGut
Copy link
Member

@TSGut TSGut commented Jan 21, 2024

WIP to address #167. It will also fix some docstring issues.

For now a lot of speed is gained by creating a dedicated method for getindex of Symmetric{Clenshaw}, CholeskyJacobiData as well as better resizing in cholesky_jacobimatrix.

For low polynomial degree of the weight the method is approaching acceptable speeds again, though even with Symmetric Clenshaw now as fast as regular Clenshaw the Clenshaw evals are still a major bottleneck:

julia> P = Normalized(Legendre());
julia> x = axes(P,1);
julia> J = jacobimatrix(P);
julia> wf(x) = (2 .+ x.^4);
julia> W = Symmetric(P \ (wf.(x) .* P));
# raw Clenshaw timing
julia> @time W[1:1000,1:1000];
  0.027092 seconds (18 allocations: 485.281 KiB)
# Cholesky X timing
julia> Jchol = cholesky_jacobimatrix(W, P);
julia> @time Jchol[1:1000,1:1000];
  0.061909 seconds (4.57 k allocations: 8.131 MiB)

Of course for high polynomial degree this causes major slow downs, although there are clearly also other things to optimize:

julia> r = 0.5;
julia> lmin, lmax = (1-sqrt(r))^2,  (1+sqrt(r))^2;
julia> P = Normalized(chebyshevu(lmin..lmax));
julia> x = axes(P,1);
julia> J = jacobimatrix(P);
julia> wf(x) = 1/(x*r);
# raw Clenshaw timing
julia> W = Symmetric(P \ (wf.(x) .* P));
julia> @time W[1:1000,1:1000];
  3.562982 seconds (18 allocations: 10.764 MiB)
# Cholesky X timing
julia> Jchol = cholesky_jacobimatrix(W, P);
julia> @time Jchol[1:1000,1:1000];
  5.413114 seconds (4.60 k allocations: 19.039 MiB)

Todo list:

  • While symmetric and non-symmetric Clenshaw are now equal in speed, Clenshaw evaluation is still very slow which constitutes 50%+ of the compute time spent in cholesky_jacobimatrix.
  • Other optimizations and reduce allocations.
  • Also make changes for QR method once Cholesky is satisfactory
  • Add tests

@TSGut
Copy link
Member Author

TSGut commented Jan 22, 2024

Ok, like I said based on profiling and benchmarking the remaining cost is in evaluating the Clenshaw matrix and performing its Cholesky decompositions.

(I still need to add tests and make tests pass, so not ready for merge yet)

@dlfivefifty How do you want to proceed with this? Is there hope for speeding up Clenshaw or the Cholesky of a Clenshaw? If not then I think this is bottlenecked at the speed of this PR.

To be fair, the computation is fast for weight modifications of low to medium degree weights, just not high degree where Clenshaw starts to struggle. We can do a 1 million size Jacobi matrix sized block in under a second for a degree 2 weight:

julia> P = Normalized(Legendre());

julia> x = axes(P,1);

julia> J = jacobimatrix(P);

julia> wf2(x) = (2 .+ x.^2);

julia> W2 = Symmetric(P \ (wf2.(x) .* P));

julia> Jchol2 = cholesky_jacobimatrix(W2, P);

julia> @time Jchol2[1:1000000,1:1000000];
  0.762198 seconds (16.00 M allocations: 1.103 GiB, 5.34% gc time)

Barring Clenshaw and Cholesky speed-ups, I suspect implementing the rational case is the next best thing.

@dlfivefifty
Copy link
Member

That speed looks fine

Copy link

codecov bot commented Jan 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (5256e8e) 92.70% compared to head (148429d) 92.68%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #169      +/-   ##
==========================================
- Coverage   92.70%   92.68%   -0.03%     
==========================================
  Files          17       17              
  Lines        1864     1886      +22     
==========================================
+ Hits         1728     1748      +20     
- Misses        136      138       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@TSGut TSGut changed the title WIP: Speed up Cholesky Jacobi matrices Speed up Cholesky Jacobi matrices Jan 23, 2024
@TSGut
Copy link
Member Author

TSGut commented Jan 23, 2024

@dlfivefifty Ok this is ready for merge or review. The patch coverage is 100%.

I have also bumped the version number so it would be good to tag after merging, then I will go and make sure SemiclassicalOPs still works as intended.

@TSGut TSGut requested a review from dlfivefifty January 27, 2024 07:06
@dlfivefifty dlfivefifty merged commit d3d03ef into JuliaApproximation:main Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants