
Question: Why prepare statement requires extra round-trip? #1212

Open
manswami23 opened this issue Feb 10, 2025 · 5 comments
Comments

@manswami23

Hi all,

I'm a newbie to Rust coming from a Java background. I've been doing some comparisons between tokio-postgres and the PostgreSQL JDBC driver and was curious about some of the differences I saw.

One difference was that the Postgres Parse message seemed to require an extra round trip in tokio-postgres, whereas in JDBC the Parse/Bind/Execute messages are sent inline. Here's an example captured packet for the latter:

[Image: packet capture showing the JDBC driver's Parse, Bind, and Execute messages sent together in a single round trip]

I'm curious what differences between the Rust and JDBC driver implementations allow the latter to eliminate the extra round trip. So far, the only thing I can tell is that the Rust implementation looks at the prepared statement output during the Bind phase, e.g. when making sure the caller passes the expected number of inputs:

if params.len() != statement.params().len() {

On the other hand, I think JDBC calculates the expected number of inputs by parsing the SQL on the client side. Are there any other reasons or examples of dependencies on the Statement result that I'm missing? I would appreciate any insight into this.

Thanks!

@sfackler (Owner)

So far, the only thing I can tell is the rust implementation looks at the prepared statement output during the Bind phase. e.g. when making sure the caller passes the expected number of inputs:

Yep, that's the reason. You can use query_typed to bypass this if you need to, but if you're that concerned about perf you should probably be preparing the statements up front and reusing them across requests.
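
A minimal sketch of both suggestions, assuming a connected tokio-postgres client in an async context that propagates errors (the query, table, and parameter values are made up for illustration):

use tokio_postgres::types::Type;

// query_typed: supply the parameter types yourself, so no separate
// prepare round trip is needed.
let rows = client
    .query_typed("SELECT name FROM users WHERE id = $1", &[(&42_i32, Type::INT4)])
    .await?;

// Or prepare once and reuse across requests: only the first call pays
// the extra round trip.
let stmt = client.prepare("SELECT name FROM users WHERE id = $1").await?;
let rows = client.query(&stmt, &[&42_i32]).await?;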

@manswami23 (Author)

Thanks @sfackler.

Regarding the query_typed API, is it safe for pipelining, and does it guarantee order of execution? I ask because I believe it uses unnamed statements, and I'm not sure how that interacts with a pipeline. I also came across #930, which is a couple of years old, but the consensus there seemed to be that order of execution was only guaranteed if the statement was prepared up front. Does that still apply to query_typed?

@sfackler (Owner)

query_typed does use the unnamed statement, but only locally to a single command batch, so it should be safe in the context of pipelined requests. The discussion in #930 was before #1147.
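
For example, a rough sketch of two pipelined query_typed calls (the queries are made up; polling the futures concurrently on one connection is what causes tokio-postgres to pipeline them):

use futures_util::future;
use tokio_postgres::types::Type;

// Both command batches are written to the socket before either response
// is awaited; each batch uses the unnamed statement only within itself.
let (a, b) = future::try_join(
    client.query_typed("SELECT $1::INT4 + 1", &[(&1_i32, Type::INT4)]),
    client.query_typed("SELECT upper($1::TEXT)", &[(&"x", Type::TEXT)]),
)
.await?;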

@manswami23 (Author)

Thanks @sfackler.

What would be the best way to achieve JDBC's batch behavior (multiple prepared statements with separate bindings, but all in one round trip)?

  • For my use case, the SQL is dynamically generated for each request, so the ability to reuse named statements is somewhat limited. As a result, I'm leaning towards using unnamed statements.
  • Also, the whole batch of SQL statements is expected to be processed in a single transaction, so preferably, when the first statement fails, none of the other statements in the transaction execute and the whole thing is rolled back.

At the moment, I've got something like this:

// Polling all four futures concurrently pipelines BEGIN, both UPDATEs,
// and COMMIT in a single round trip.
let (_begin, first_stream, second_stream, _commit) = future::try_join4(
    client.query_typed("BEGIN", &[]),
    client.query_typed_raw("UPDATE table SET name = $1", table_params_update),
    client.query_typed_raw("UPDATE other_table SET othername = $1", other_params_update),
    client.query_typed("COMMIT", &[]),
)
.await?;

The only gap I'm seeing is that if the UPDATE table SET name = $1 statement fails, the subsequent update isn't discarded as it is in JDBC. Instead, it still runs, and I get an ERROR: current transaction is aborted, commands ignored until end of transaction block in the query log. I suspect the difference in behavior comes from the Sync message at the end of each query, because after the Sync, the backend will try and fail to process the next query.

Is there any mechanism by which I could force statements after a failure to be discarded? Is the only option for the pipelined queries to be followed by a single Sync message instead of a Sync per statement?

@zacknewman commented Feb 17, 2025

Is there a reason you don't use Client::build_transaction for transactions? For maximum performance, you may want to define a PL/pgSQL function and execute it with query_typed; that avoids the overhead of a rather useless PREPARE for a query that just calls a database-side function. PL/pgSQL functions are not limited to single statements, so you can reduce both the number of calls to the database server and the payload size by executing a single function. Internally, PL/pgSQL functions rely on prepared statements together with a simple plan-caching heuristic: a generic execution plan, which avoids the "cost" of plan generation, is used only after the query optimizer "proves" it's better than on-demand planning. Specifically, only after five executions, and only if the optimizer estimates that the performance penalty of a generic plan is small enough to justify using a cached plan, will it switch to one. See plan caching for more information. This basically makes PL/pgSQL functions more powerful than client-side PREPARE statements.

The biggest benefit, though, is the ability to execute all of your statements with just a single database call. Pipelining improves the situation a bit, but it's still "worse" than what could be done in a single database call (ignoring the calls to BEGIN and COMMIT/ROLLBACK).
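
A rough sketch of that approach, assuming a connected tokio-postgres client (the function, tables, and parameters are hypothetical):

use tokio_postgres::types::Type;

// Define the function once, e.g. as part of schema migrations.
client
    .batch_execute(
        "CREATE OR REPLACE FUNCTION apply_updates(new_name TEXT, new_othername TEXT)
         RETURNS VOID
         LANGUAGE plpgsql
         AS $$
         BEGIN
             UPDATE my_table SET name = new_name;
             UPDATE other_table SET othername = new_othername;
         END;
         $$",
    )
    .await?;

// Per request: one database call runs both UPDATEs, and the whole call
// executes atomically in a single implicit transaction.
client
    .query_typed(
        "SELECT apply_updates($1, $2)",
        &[(&"alice", Type::TEXT), (&"bob", Type::TEXT)],
    )
    .await?;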
