Skip to content

Speed-up and improve accuracy with Rump Algorithms (3.1) and (5.10) #101366

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 27, 2023

Conversation

rhettinger
Copy link
Contributor

New timing for len(vec)==100:

% ./time_sumprod.sh
500000 loops, best of 5: 665 nsec per loop

Accuracy test results:

% ./python.exe test_dot.py
1e7  [-16.735704249603927, -16.433613929693916, -16.25421876821838]
1e14 [-16.713231477040374, -16.422704553046025, -16.25668486033119]
1e21 [-16.710166853796867, -16.4211432479268, -16.246020295205902]
1e26 [-16.683511533736446, -16.395569159766588, -16.239464917893343]
1e28 [-16.72597944323953, -16.422877537721703, -16.248693230057754]
1e30 [-16.71065701261481, -16.401881986615052, -16.237891116570623]
1e35 [-14.237586875824306, -13.498240351082394, -12.853021727889974]
n=20  times=1000  version='3.12.0a4+ (heads/sumprod_5_10:720004459b, Jan 27 2023, 01:20:39) '

@rhettinger
Copy link
Contributor Author

Timing script:

./python.exe -m timeit -s 'from math import sumprod' -s 'from random import expovariate as r' -s 'n=100' -s 'v1 = [r() for i in range(n)]' -s 'v2 = [r() for i in range(n)]' 'sumprod(v1, v2)'

Accuracy test script:

'Test generator for sumprod()'

import operator
from fractions import Fraction
from itertools import starmap
from collections import namedtuple
from math import log2, exp2, log10, fabs, sumprod
from random import choices, uniform, shuffle
from statistics import quantiles, mean
from pprint import pp
import sys

DotExample = namedtuple('DotExample', ('x', 'y', 'target_sumprod', 'condition'))

def DotExact(x, y):
    vec1 = map(Fraction, x)
    vec2 = map(Fraction, y)
    return sum(starmap(operator.mul, zip(vec1, vec2, strict=True)))

def Condition(x, y):
    return 2.0 * DotExact(map(abs, x), map(abs, y)) / abs(DotExact(x, y))

def linspace(lo, hi, n):
    width = (hi - lo) / (n - 1)
    return [lo + width * i for i in range(n)]

def GenDot(n, c=1e12):
    """ Algorithm 6.1 (GenDot) works as follows. The condition number (5.7) of
    the dot product xT y is proportional to the degree of cancellation. In
    order to achieve a prescribed cancellation, we generate the first half of
    the vectors x and y randomly within a large exponent range. This range is
    chosen according to the anticipated condition number. The second half of x
    and y is then constructed choosing xi randomly with decreasing exponent,
    and calculating yi such that some cancellation occurs. Finally, we permute
    the vectors x, y randomly and calculate the achieved condition number.
    """

    assert n >= 6
    n2 = n // 2
    x = [0.0] * n
    y = [0.0] * n
    b = log2(c)

    # First half with exponents from 0 to |_b/2_| and random ints in between
    e = choices(range(int(b/2)), k=n2)
    e[0] = int(b / 2) + 1
    e[-1] = 0.0

    x[:n2] = [uniform(-1.0, 1.0) * exp2(p) for p in e]
    y[:n2] = [uniform(-1.0, 1.0) * exp2(p) for p in e]

    # Second half
    e = list(map(round, linspace(b/2, 0.0 , n-n2)))
    for i in range(n2, n):
        x[i] = uniform(-1.0, 1.0) * exp2(e[i - n2])
        y[i] = (uniform(-1.0, 1.0) * exp2(e[i - n2]) - DotExact(x, y)) / x[i]

    # Shuffle
    pairs = list(zip(x, y))
    shuffle(pairs)
    x, y = zip(*pairs)

    return DotExample(x, y, DotExact(x, y), Condition(x, y))

def RelativeError(res, ex):
    x, y, target_sumprod, condition = ex
    n = DotExact(list(x) + [-res], list(y) + [1])
    re = log10(fabs(n / target_sumprod))
    return min(log10(2.0), re)

def Trial(dotfunc, c=10e7, n=10):
    ex = GenDot(10, c)
    res = dotfunc(ex.x, ex.y)
    return RelativeError(res, ex)

def plain_sumprod(x, y):
    total = 0.0
    for x_i, y_i in zip(x, y):
        total += x_i * y_i
    return total

if __name__ == '__main__':

    times = 1000
    n = 20
    for cs in '1e7 1e14 1e21 1e26 1e28 1e30 1e35'.split():
        c = float(cs)
        log_re_errs = [Trial(sumprod, c, n=n) for i in range(times)]
        print(f'{cs:4}', quantiles(log_re_errs))        #  max(log_re_errs))

    version = sys.version
    i = version.index('[')
    version = version[:i]
    print(f'{n=}  {times=}  {version=}')

@rhettinger rhettinger merged commit 7956e0c into python:main Jan 27, 2023
@rhettinger rhettinger deleted the sumprod_5_10 branch January 27, 2023 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage skip issue skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants