Skip to content

Support clenshaw! with any DenseColumnMajor blas vector #113

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jul 16, 2020

Conversation

dlfivefifty
Copy link
Member

No description provided.

@MikaelSlevinsky
Copy link
Member

Trying to reactivate coverage

@MikaelSlevinsky
Copy link
Member

The coefficients c can also be strided (with incc).

@dlfivefifty
Copy link
Member Author

So change the 1 to stride(c,1)?

@MikaelSlevinsky
Copy link
Member

I think so.

This is actually really useful because the Julia API will let me fill in the C code incrementally

@dlfivefifty
Copy link
Member Author

dlfivefifty commented Jul 16, 2020

We should probably try multithreading clenshaw!. Though for small length(x) the Julia code is faster than your C code, presumably due to multi-threading... so that will probably take some rough performance tuning on when to thread.

@MikaelSlevinsky
Copy link
Member

Is threading naively turned on in Julia? I thought you needed to use Threads.@threads on a for loop.

ccall((:ft_clenshawf, libfasttransforms), Cvoid, (Cint, Ptr{Float32}, Cint, Cint, Ptr{Float32}, Ptr{Float32}), length(c), c, 1, length(x), x, f)
function _clenshaw!(::AbstractStridedLayout, ::AbstractColumnMajor, ::AbstractColumnMajor, c::AbstractVector{Float32}, x::AbstractVector{Float32}, f::AbstractVector{Float32})
@boundscheck check_clenshaw_points(x, f)
ccall((:ft_clenshawf, libfasttransforms), Cvoid, (Cint, Ptr{Float32}, Cint, Cint, Ptr{Float32}, Ptr{Float32}), length(c), stride(c,1), 1, length(x), x, f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this changed c instead of 1

@codecov
Copy link

codecov bot commented Jul 16, 2020

Codecov Report

Merging #113 into master will increase coverage by 0.92%.
The diff coverage is 97.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #113      +/-   ##
==========================================
+ Coverage   78.32%   79.25%   +0.92%     
==========================================
  Files          12       13       +1     
  Lines        1352     1417      +65     
==========================================
+ Hits         1059     1123      +64     
- Misses        293      294       +1     
Impacted Files Coverage Δ
src/FastTransforms.jl 100.00% <ø> (ø)
src/libfasttransforms.jl 86.75% <95.45%> (+0.20%) ⬆️
src/clenshaw.jl 98.11% <98.11%> (ø)
src/gaunt.jl 85.13% <0.00%> (+1.35%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f89dbb3...4ffc187. Read the comment docs.

@dlfivefifty
Copy link
Member Author

Yes that's what I mean, add Threads.@threads

@MikaelSlevinsky
Copy link
Member

The C clenshaw isn't threaded. I thought an O(mn) method might best be threadsafe until further consideration. If the Julia code is faster for small point sets it's probably because of higher throughput for lower vectorization levels. Probably need about 16 - 32 points to start to see the difference? I've only been concerned with m = n > 1024, say.

@MikaelSlevinsky
Copy link
Member

OK, feel free to merge when you're happy

@dlfivefifty
Copy link
Member Author

dlfivefifty commented Jul 16, 2020

Got OrthogonalPolynomialsQuasi.jl to call clenshaw! now!
Do this in Chebfun or ApproxFun 🤣

julia> T = Chebyshev();

julia> c = randn(10_000_000);

julia> u = T * [c; zeros(∞)];

julia> x = rand(1000);

julia> @time u[x]
  1.587686 seconds (3 allocations: 8.000 KiB)
1000-element Array{Float64,1}:
 -2312.029629930109
 -2390.9279613489657
 -1354.6587225463163
   626.9532040037507
 -3543.7729254722767
 -2190.2464546247734
 -2572.0111713445567
  -746.4599775088337
   514.8473134322153
  1364.2562428541896
  1155.2279350358774
  -562.0256249035326
 -5933.812790545215
  -244.4875261855841
   690.8312153482333
     
 -2944.8625897604925
   273.0743333139484
 -2318.133247373495
   808.2158822165528
 -1636.3031403509444
 -2500.700269418508
  -881.2745832074958
 -4010.298262876177
  1400.478600291998
  -724.9251230992776
   103.64841191004098
  -990.9982621498198
 -1738.304836473942
  -808.4222444061252

@dlfivefifty dlfivefifty merged commit 9703ea6 into master Jul 16, 2020
@dlfivefifty dlfivefifty deleted the dl/denseclenshaw branch July 16, 2020 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants