bpo-46874: Speed up sqlite3 user-defined aggregate 'step' method #31604

Merged
merged 2 commits into from
Mar 3, 2022

erlend-aasland
Contributor

@erlend-aasland erlend-aasland commented Feb 27, 2022

Also improve error message if the step method is missing.

https://bugs.python.org/issue46874

@erlend-aasland erlend-aasland requested a review from corona10 March 2, 2022 08:29
Member

@corona10 corona10 left a comment

lgtm

@corona10 corona10 merged commit 88567a9 into python:main Mar 3, 2022
@erlend-aasland erlend-aasland deleted the sqlite-improve-func branch March 3, 2022 13:54
@erlend-aasland
Contributor Author

Thanks, Dong-hee!

@rinsuki

rinsuki commented Jul 30, 2022

I discovered a bug today that this pull request already seems to fix.

Should I still open an issue for it, or is that unnecessary because it is already fixed on the main/3.11 branch?

Summary of the bug

When an incomplete aggregate class is passed and the first call within a query is made with empty data, the sqlite3 library raises the wrong error.

Repro Code
import sqlite3
import traceback

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER NOT NULL)")
db.execute("INSERT INTO users VALUES (0)")
db.execute("INSERT INTO users VALUES (1)")
db.execute("CREATE TABLE scores (player_id INTEGER NOT NULL, score INTEGER NOT NULL)")
db.execute("INSERT INTO scores (player_id, score) VALUES (1, 10)")

class Median:
    def __init__(self):
        self.data = []

    # Oops, a typo: this method should be named "step"
    def steeep(self, x):
        self.data.append(x)

    def finalize(self):
        data = sorted(self.data)
        ld = len(data)
        if ld == 0:
            return int(0)
        elif ld % 2 == 0:
            return int((data[ld // 2 - 1] + data[ld // 2]) / 2)
        else:
            return int(data[ld // 2])

db.create_aggregate("median", 1, Median)
try:
    cur = db.execute("SELECT player_id, median(score) FROM scores GROUP BY player_id")
    print(cur.fetchall())
except Exception:
    traceback.print_exc()

try:
    cur = db.execute("SELECT users.id, median(score) FROM users LEFT JOIN scores ON users.id = scores.player_id GROUP BY player_id ORDER BY users.id DESC")
    print(cur.fetchall())
except Exception:
    traceback.print_exc()

This repro code prints two errors, and they should be the same.

But in 3.11.0a5 and older, Python prints two different errors (wrong behaviour):

Traceback (most recent call last):
  File "/app/sqlite-aggregate-test.py", line 31, in <module>
    cur = db.execute("SELECT player_id, median(score) FROM scores GROUP BY player_id")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Median' object has no attribute 'step'
Traceback (most recent call last):
  File "/app/sqlite-aggregate-test.py", line 37, in <module>
    cur = db.execute("SELECT users.id, median(score) FROM users LEFT JOIN scores ON users.id = scores.player_id GROUP BY player_id ORDER BY users.id DESC")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: user-defined aggregate's '__init__' method raised error

But in 3.11.0a6 and newer, Python prints the same error for both queries (correct behaviour):

Traceback (most recent call last):
  File "/app/sqlite-aggregate-test.py", line 31, in <module>
    cur = db.execute("SELECT player_id, median(score) FROM scores GROUP BY player_id")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: user-defined aggregate's 'step' method not defined
Traceback (most recent call last):
  File "/app/sqlite-aggregate-test.py", line 37, in <module>
    cur = db.execute("SELECT users.id, median(score) FROM users LEFT JOIN scores ON users.id = scores.player_id GROUP BY player_id ORDER BY users.id DESC")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: user-defined aggregate's 'step' method not defined
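
For comparison, here is a minimal sketch of the same schema and queries with the method spelled `step`, so that both queries return rows instead of raising. The `run_queries` helper and the `None` check in `step` are assumptions added for illustration and are not part of the repro above; the check is there because the unmatched `LEFT JOIN` row passes SQL NULL to the aggregate as `None`.

```python
import sqlite3


class Median:
    """Aggregate exposing the two methods sqlite3 looks up: step and finalize."""

    def __init__(self):
        self.data = []

    def step(self, x):
        # SQL NULL arrives as None (e.g. from unmatched LEFT JOIN rows); skip it.
        if x is not None:
            self.data.append(x)

    def finalize(self):
        data = sorted(self.data)
        n = len(data)
        if n == 0:
            return 0
        if n % 2 == 0:
            return (data[n // 2 - 1] + data[n // 2]) // 2
        return data[n // 2]


def run_queries():
    """Hypothetical helper: build the repro schema and run both queries."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER NOT NULL)")
    db.execute("INSERT INTO users VALUES (0)")
    db.execute("INSERT INTO users VALUES (1)")
    db.execute(
        "CREATE TABLE scores (player_id INTEGER NOT NULL, score INTEGER NOT NULL)"
    )
    db.execute("INSERT INTO scores (player_id, score) VALUES (1, 10)")
    db.create_aggregate("median", 1, Median)

    grouped = db.execute(
        "SELECT player_id, median(score) FROM scores GROUP BY player_id"
    ).fetchall()
    joined = db.execute(
        "SELECT users.id, median(score) FROM users "
        "LEFT JOIN scores ON users.id = scores.player_id "
        "GROUP BY player_id ORDER BY users.id DESC"
    ).fetchall()
    db.close()
    return grouped, joined


if __name__ == "__main__":
    for rows in run_queries():
        print(rows)
```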

@erlend-aasland
Contributor Author

Hi, @rinsuki! Please open a new ticket for this, and I'll see if we can manage to backport this.

I also notice that the error message could be improved by including the object's name instead of just the nondescript "user-defined aggregate".
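
Until the message itself names the class, a caller-side check can surface the class name at registration time. The following is only a sketch under that assumption: `create_checked_aggregate` is a hypothetical wrapper, not part of the sqlite3 module, and it relies only on the existing `Connection.create_aggregate` call.

```python
import sqlite3


def create_checked_aggregate(conn, name, n_arg, aggregate_class):
    """Hypothetical wrapper: fail early, naming the class, if a method is missing."""
    missing = [
        method
        for method in ("step", "finalize")
        if not callable(getattr(aggregate_class, method, None))
    ]
    if missing:
        raise TypeError(
            f"aggregate class {aggregate_class.__name__!r} is missing "
            f"required method(s): {', '.join(missing)}"
        )
    # Only register the aggregate once both methods are known to exist.
    conn.create_aggregate(name, n_arg, aggregate_class)
```

With the `Median` class from the repro, this would raise a `TypeError` mentioning `'Median'` and `step` when the aggregate is registered, rather than an error at query time.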

@rinsuki

rinsuki commented Jul 30, 2022

I opened it! #95462
