Speed-up and improve accuracy with Rump Algorithms (3.1) and (5.10) #101366

rhettinger · 2023-01-27T07:21:57Z

New timing for len(vec)==100:

% ./time_sumprod.sh
500000 loops, best of 5: 665 nsec per loop

Accuracy test results:

% ./python.exe test_dot.py
1e7  [-16.735704249603927, -16.433613929693916, -16.25421876821838]
1e14 [-16.713231477040374, -16.422704553046025, -16.25668486033119]
1e21 [-16.710166853796867, -16.4211432479268, -16.246020295205902]
1e26 [-16.683511533736446, -16.395569159766588, -16.239464917893343]
1e28 [-16.72597944323953, -16.422877537721703, -16.248693230057754]
1e30 [-16.71065701261481, -16.401881986615052, -16.237891116570623]
1e35 [-14.237586875824306, -13.498240351082394, -12.853021727889974]
n=20  times=1000  version='3.12.0a4+ (heads/sumprod_5_10:720004459b, Jan 27 2023, 01:20:39) '

rhettinger · 2023-01-27T07:34:30Z

Timing script:

./python.exe -m timeit -s 'from math import sumprod' -s 'from random import expovariate as r' -s 'n=100' -s 'v1 = [r() for i in range(n)]' -s 'v2 = [r() for i in range(n)]' 'sumprod(v1, v2)'

Accuracy test script:

'Test generator for sumprod()'

import operator
from fractions import Fraction
from itertools import starmap
from collections import namedtuple
from math import log2, exp2, log10, fabs, sumprod
from random import choices, uniform, shuffle
from statistics import quantiles, mean
from pprint import pp
import sys

DotExample = namedtuple('DotExample', ('x', 'y', 'target_sumprod', 'condition'))

def DotExact(x, y):
    vec1 = map(Fraction, x)
    vec2 = map(Fraction, y)
    return sum(starmap(operator.mul, zip(vec1, vec2, strict=True)))

def Condition(x, y):
    return 2.0 * DotExact(map(abs, x), map(abs, y)) / abs(DotExact(x, y))

def linspace(lo, hi, n):
    width = (hi - lo) / (n - 1)
    return [lo + width * i for i in range(n)]

def GenDot(n, c=1e12):
    """ Algorithm 6.1 (GenDot) works as follows. The condition number (5.7) of
    the dot product xT y is proportional to the degree of cancellation. In
    order to achieve a prescribed cancellation, we generate the first half of
    the vectors x and y randomly within a large exponent range. This range is
    chosen according to the anticipated condition number. The second half of x
    and y is then constructed choosing xi randomly with decreasing exponent,
    and calculating yi such that some cancellation occurs. Finally, we permute
    the vectors x, y randomly and calculate the achieved condition number.
    """

    assert n >= 6
    n2 = n // 2
    x = [0.0] * n
    y = [0.0] * n
    b = log2(c)

    # First half with exponents from 0 to |_b/2_| and random ints in between
    e = choices(range(int(b/2)), k=n2)
    e[0] = int(b / 2) + 1
    e[-1] = 0.0

    x[:n2] = [uniform(-1.0, 1.0) * exp2(p) for p in e]
    y[:n2] = [uniform(-1.0, 1.0) * exp2(p) for p in e]

    # Second half
    e = list(map(round, linspace(b/2, 0.0 , n-n2)))
    for i in range(n2, n):
        x[i] = uniform(-1.0, 1.0) * exp2(e[i - n2])
        y[i] = (uniform(-1.0, 1.0) * exp2(e[i - n2]) - DotExact(x, y)) / x[i]

    # Shuffle
    pairs = list(zip(x, y))
    shuffle(pairs)
    x, y = zip(*pairs)

    return DotExample(x, y, DotExact(x, y), Condition(x, y))

def RelativeError(res, ex):
    x, y, target_sumprod, condition = ex
    n = DotExact(list(x) + [-res], list(y) + [1])
    re = log10(fabs(n / target_sumprod))
    return min(log10(2.0), re)

def Trial(dotfunc, c=10e7, n=10):
    ex = GenDot(10, c)
    res = dotfunc(ex.x, ex.y)
    return RelativeError(res, ex)

def plain_sumprod(x, y):
    total = 0.0
    for x_i, y_i in zip(x, y):
        total += x_i * y_i
    return total

if __name__ == '__main__':

    times = 1000
    n = 20
    for cs in '1e7 1e14 1e21 1e26 1e28 1e30 1e35'.split():
        c = float(cs)
        log_re_errs = [Trial(sumprod, c, n=n) for i in range(times)]
        print(f'{cs:4}', quantiles(log_re_errs))        #  max(log_re_errs))

    version = sys.version
    i = version.index('[')
    version = version[:i]
    print(f'{n=}  {times=}  {version=}')

…ythonGH-101366)

Speed-up and improve accuracy with Rump Algorithms (3.1) and (5.10)

7200044

rhettinger added performance Performance or resource usage skip issue skip news labels Jan 27, 2023

bedevere-bot added the awaiting core review label Jan 27, 2023

rhettinger merged commit 7956e0c into python:main Jan 27, 2023

bedevere-bot removed the awaiting core review label Jan 27, 2023

rhettinger deleted the sumprod_5_10 branch January 27, 2023 07:56

mdboom pushed a commit to mdboom/cpython that referenced this pull request Jan 31, 2023

Speed-up and improve accuracy with Rump Algorithms (3.1) and (5.10) (p…

5c35777

…ythonGH-101366)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Speed-up and improve accuracy with Rump Algorithms (3.1) and (5.10) #101366

Speed-up and improve accuracy with Rump Algorithms (3.1) and (5.10) #101366

Uh oh!

rhettinger commented Jan 27, 2023

Uh oh!

rhettinger commented Jan 27, 2023

Uh oh!

Uh oh!

Uh oh!

Speed-up and improve accuracy with Rump Algorithms (3.1) and (5.10) #101366

Speed-up and improve accuracy with Rump Algorithms (3.1) and (5.10) #101366

Uh oh!

Conversation

rhettinger commented Jan 27, 2023

Uh oh!

rhettinger commented Jan 27, 2023

Uh oh!

Uh oh!