-
Notifications
You must be signed in to change notification settings - Fork 8
Speed up Cholesky Jacobi matrices #169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Ok, like I said based on profiling and benchmarking the remaining cost is in evaluating the Clenshaw matrix and performing its Cholesky decompositions. (I still need to add tests and make tests pass, so not ready for merge yet) @dlfivefifty How do you want to proceed with this? Is there hope for speeding up Clenshaw or the Cholesky of a Clenshaw? If not then I think this is bottlenecked at the speed of this PR. To be fair, the computation is fast for weight modifications of low to medium degree weights, just not high degree where Clenshaw starts to struggle. We can do a 1 million size Jacobi matrix sized block in under a second for a degree 2 weight: julia> P = Normalized(Legendre());
julia> x = axes(P,1);
julia> J = jacobimatrix(P);
julia> wf2(x) = (2 .+ x.^2);
julia> W2 = Symmetric(P \ (wf2.(x) .* P));
julia> Jchol2 = cholesky_jacobimatrix(W2, P);
julia> @time Jchol2[1:1000000,1:1000000];
0.762198 seconds (16.00 M allocations: 1.103 GiB, 5.34% gc time) Barring Clenshaw and Cholesky speed-ups, I suspect implementing the rational case is the next best thing. |
That speed looks fine |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #169 +/- ##
==========================================
- Coverage 92.70% 92.68% -0.03%
==========================================
Files 17 17
Lines 1864 1886 +22
==========================================
+ Hits 1728 1748 +20
- Misses 136 138 +2 ☔ View full report in Codecov by Sentry. |
@dlfivefifty Ok this is ready for merge or review. The patch coverage is 100%. I have also bumped the version number so it would be good to tag after merging, then I will go and make sure SemiclassicalOPs still works as intended. |
WIP to address #167. It will also fix some docstring issues.
For now a lot of speed is gained by creating a dedicated method for getindex of
Symmetric{Clenshaw}
,CholeskyJacobiData
as well as better resizing incholesky_jacobimatrix
.For low polynomial degree of the weight the method is approaching acceptable speeds again, though even with Symmetric Clenshaw now as fast as regular Clenshaw the Clenshaw evals are still a major bottleneck:
Of course for high polynomial degree this causes major slow downs, although there are clearly also other things to optimize:
Todo list:
cholesky_jacobimatrix
.