vene edited this page Jul 2, 2012 · 2 revisions

FastICA

FastICA's design took the nonlinear function G and its derivative G' as two separate callables, which was suboptimal. Very often the derivative can be computed faster when it is computed at the same time as the function, and this fits FastICA, because the two are always evaluated together.
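As an illustration of why a combined callable helps, here is a minimal sketch for the logcosh case. The function names and the `alpha` parameter are assumptions made for this example, not necessarily the signatures used in fastica_.py. The point is that the derivative reuses the `tanh` already computed for the function value, while the separate version has to compute it twice.

```python
import numpy as np


def g_separate(x, alpha=1.0):
    # Nonlinearity on its own: one tanh evaluation.
    return np.tanh(alpha * x)


def gprime_separate(x, alpha=1.0):
    # Derivative on its own: has to evaluate tanh a second time.
    return alpha * (1.0 - np.tanh(alpha * x) ** 2)


def g_combined(x, alpha=1.0):
    # Hypothetical combined callable: tanh is evaluated once and reused.
    gx = np.tanh(alpha * x)
    gprime_x = alpha * (1.0 - gx ** 2)
    return gx, gprime_x


if __name__ == "__main__":
    x = np.random.RandomState(0).randn(5, 4)
    gx, gprime_x = g_combined(x)
    # Both designs give the same values; only the amount of work differs.
    print(np.allclose(gx, g_separate(x)), np.allclose(gprime_x, gprime_separate(x)))
```

How much this saves depends on the nonlinearity: it matters most when the function and its derivative share an expensive intermediate, as `tanh` is here.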

For small datasets, the cost of the calls to g and gprime can sometimes outweigh the cost of np.dot and eigh.

For a random X of shape (500, 500), here is the before and after. This is a small but consistent speedup, and for certain functions it could make more of a difference.

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1599   30.089    0.019   30.089    0.019 {numpy.core._dotblas.dot}
      200   25.427    0.127   25.788    0.129 decomp.py:196(eigh)
        1    1.678    1.678   59.546   59.546 fastica_.py:94(_ica_par)
      199    1.490    0.007    1.490    0.007 fastica_.py:229(g)
      200    0.284    0.001   41.156    0.206 fastica_.py:40(_sym_decorrelation)

parallel logcosh:
1 loops, best of 3: 16.8 s per loop
parallel exp:
1 loops, best of 3: 16.4 s per loop
parallel cube:
1 loops, best of 3: 911 ms per loop
deflation logcosh:
1 loops, best of 3: 3.93 s per loop
deflation exp:
1 loops, best of 3: 7.45 s per loop
deflation cube:
1 loops, best of 3: 1.13 s per loop

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1599   28.578    0.018   28.578    0.018 {numpy.core._dotblas.dot}
      200   27.604    0.138   27.944    0.140 decomp.py:196(eigh)
        1    1.615    1.615   61.189   61.189 fastica_.py:88(_ica_par)
      199    1.569    0.008    1.570    0.008 fastica_.py:219(gprime)
      199    0.995    0.005    0.995    0.005 fastica_.py:215(g)
      200    0.272    0.001   42.587    0.213 fastica_.py:40(_sym_decorrelation)

parallel logcosh:
1 loops, best of 3: 16.6 s per loop
parallel exp:
1 loops, best of 3: 15.9 s per loop
parallel cube:
1 loops, best of 3: 901 ms per loop
deflation logcosh:
1 loops, best of 3: 3.92 s per loop
deflation exp:
1 loops, best of 3: 6.86 s per loop
deflation cube:
1 loops, best of 3: 1.05 s per loop

This is selected output from %prun along with the output of this code:

import numpy as np
from sklearn.decomposition import fastica

# Run this in IPython: %timeit is an IPython magic, not plain Python.
np.random.seed(0)
X = np.random.randn(300, 300)
for algorithm in ('parallel', 'deflation'):
    for fun in ('logcosh', 'exp', 'cube'):
        print('%s %s:' % (algorithm, fun))
        %timeit fastica(X, algorithm=algorithm, fun=fun)
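Outside IPython, a table in the same format as the %prun output can be collected with the standard library's cProfile and pstats modules. The sketch below is an assumption about how one might do this, not the procedure used for the numbers on this page. `profile_call` is a helper name invented for the example, and `workload` is a hypothetical stand-in for the call to `fastica`.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return the top entries sorted by tottime."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by time spent inside each function itself, as in the tables above.
    stats.sort_stats('tottime').print_stats(5)
    return stream.getvalue()


def workload(n=20000):
    # Hypothetical stand-in; replace with e.g. fastica(X, algorithm=..., fun=...).
    return sum(i * i for i in range(n))


if __name__ == '__main__':
    print(profile_call(workload))
```

To profile the real code, one would pass `fastica` and its arguments to `profile_call` in place of `workload`.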