Flatten schemas and replace macros with plain code #450

Merged
adriangb merged 1 commit into main from flatten-val-build on Mar 17, 2023

Member

@adriangb adriangb commented Mar 16, 2023

Closes #444

@adriangb adriangb force-pushed the flatten-val-build branch from c34243a to 7dc5087 Compare March 16, 2023 06:05
@codspeed-hq

codspeed-hq bot commented Mar 16, 2023

CodSpeed Performance Report

Merging #450 flatten-val-build (894e90c) will not alter performances.

Summary

🔥 0 improvements
❌ 1 regressions
✅ 92 untouched benchmarks

🆕 0 new benchmarks
⁉️ 0 dropped benchmarks

Benchmarks breakdown

| Benchmark | main | flatten-val-build | Change |
|---|---|---|---|
| test_build_schema | 3.6 ms | 4.1 ms | -13.66% |

@codecov-commenter

codecov-commenter commented Mar 16, 2023

Codecov Report

Merging #450 (894e90c) into main (653308f) will decrease coverage by 0.46%.
The diff coverage is 87.36%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #450      +/-   ##
==========================================
- Coverage   95.36%   94.90%   -0.46%     
==========================================
  Files          93       93              
  Lines       11115    11157      +42     
  Branches       22       22              
==========================================
- Hits        10600    10589      -11     
- Misses        510      563      +53     
  Partials        5        5              
| Impacted Files | Coverage Δ |
|---|---|
| src/serializers/type_serializers/other.rs | 84.90% <ø> (-2.79%) ⬇️ |
| src/serializers/type_serializers/tuple.rs | 93.07% <ø> (-0.12%) ⬇️ |
| src/validators/tuple.rs | 98.85% <ø> (+0.52%) ⬆️ |
| src/validators/float.rs | 88.59% <40.00%> (-10.39%) ⬇️ |
| pydantic_core/core_schema.py | 97.01% <100.00%> (-0.01%) ⬇️ |
| src/serializers/shared.rs | 90.19% <100.00%> (-0.41%) ⬇️ |
| src/serializers/type_serializers/function.rs | 93.57% <100.00%> (+0.94%) ⬆️ |
| src/validators/custom_error.rs | 100.00% <100.00%> (ø) |
| src/validators/function.rs | 99.02% <100.00%> (-0.04%) ⬇️ |
| src/validators/mod.rs | 98.75% <100.00%> (+0.01%) ⬆️ |

... and 1 more

... and 1 file with indirect coverage changes


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 653308f...894e90c.

@adriangb adriangb requested a review from dmontagu March 16, 2023 06:56
@adriangb adriangb marked this pull request as ready for review March 16, 2023 06:56
@adriangb adriangb force-pushed the flatten-val-build branch from 1b2c0c5 to 20837c2 Compare March 16, 2023 15:12
Comment on lines 351 to 360
// The left hand side of this match is a 1:1 match with the `type` field we use as a discriminator
// in our CoreSchema union
// The right hand side is _generally_ a 1:1 match but there are cases where we use a `Builder`
// on the right that may have internal logic to return different validators or just build
// a more complex validator (e.g. building a union of isinstance validators or something).
// So to get from Python -> Rust implementation you should trace the `type` on the left hand side
// of the match and then if the right hand side is a `{Type}Validator` you've found the implementation.
// If the right hand side is a `{Type}Builder` you'll have to look into its `build()` method to see
// what it actually returns.
// TODO: sort alphabetically? By left hand side string or right hand side type? Or RHS type's module filename?
Member Author

@dmontagu comment as per discussion earlier today
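
For readers following the code comment above, here is a minimal standalone sketch of the lookup it describes: the `type` string on the left of the match, and on the right either a validator or a builder whose `build()` decides what to return. The names `Validator`, `UnionBuilder`, and `build_validator`, and the single-choice collapse in the builder, are hypothetical stand-ins for illustration and are not pydantic-core's actual types or behavior.

```rust
// Hypothetical stand-ins for validator implementations (not pydantic-core's types).
#[derive(Debug, PartialEq)]
enum Validator {
    Int,
    Str,
    Union(Vec<Validator>),
}

// Hypothetical builder: it has internal logic and may return different validators.
struct UnionBuilder;

impl UnionBuilder {
    fn build(choices: &[&str]) -> Result<Validator, String> {
        let mut built = Vec::with_capacity(choices.len());
        for choice in choices {
            built.push(build_validator(*choice, &[])?);
        }
        // Assumed behavior for illustration: one choice collapses to that validator.
        if built.len() == 1 {
            Ok(built.remove(0))
        } else {
            Ok(Validator::Union(built))
        }
    }
}

// Left-hand side: the `type` discriminator string.
// Right-hand side: a validator directly, or a builder that decides what to return.
fn build_validator(type_: &str, choices: &[&str]) -> Result<Validator, String> {
    match type_ {
        "int" => Ok(Validator::Int),
        "str" => Ok(Validator::Str),
        "union" => UnionBuilder::build(choices),
        other => Err(format!("unknown schema type: {}", other)),
    }
}
```

Tracing `"int"` ends at a validator immediately, while tracing `"union"` requires reading `UnionBuilder::build` to learn what comes back, which is the distinction the comment draws.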

Member

@samuelcolvin samuelcolvin left a comment

the flattening looks good.

I really don't agree about removing the macros, sorry.

Cargo.toml Outdated
@@ -1,4 +1,5 @@
[package]
rust-version = "1.59"
Member

Hmm, I'm not sure about this. Is there a specific reason to pin this?

Member Author

Sorry, again this is accidental. While trying to debug that segfault, I updated to 1.70, which seems to be totally broken.

rust-toolchain Outdated
@@ -1 +1 @@
-nightly
+nightly-2023-03-01
Member

I'd rather not pin this.

Probably better to remove this file completely.

Member Author

Sorry this was accidental and unrelated to this PR.

Member

the file is now gone which should make things easier.

$($e_key($e_serializer),)*
$($b_key($b_serializer),)*
}
#[derive(Debug, Clone)]
Member

sorry 😄

I don't agree about removing this macro, it's working, it removes duplication, happy to add a better docstring.

I'd really rather we didn't remove it.

@@ -40,7 +40,8 @@ impl BuildSerializer for FunctionBuilder {
     build_context: &mut BuildContext<CombinedSerializer>,
 ) -> PyResult<CombinedSerializer> {
     let py = schema.py();
-    let mode: &str = schema.get_as_req(intern!(py, "mode"))?;
+    let type_: &str = schema.get_as_req(intern!(py, "type"))?;
+    let mode = type_.split_once('-').unwrap().1;
Member

neat.
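
As a small illustration of the line being praised, the sketch below derives the mode from a flattened type string such as `"function-wrap"` using `str::split_once`. The helper name `mode_from_type` is hypothetical. Unlike the diff, which calls `.unwrap()` (presumably safe there because the type string has already been matched), this version returns `None` when there is no hyphen.

```rust
// Hypothetical helper: extract the part after the first '-' in a type string.
// Returns None rather than panicking when the string contains no '-'.
fn mode_from_type(type_: &str) -> Option<&str> {
    type_.split_once('-').map(|(_prefix, mode)| mode)
}
```

Note that `split_once` splits at the first hyphen only, so everything after it is treated as the mode.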

@@ -39,7 +22,7 @@ pub struct TupleVariableSerializer {
}

impl TupleVariableSerializer {
-    fn build(
+    pub fn build(
Member

this should become an implementation of the BuildSerializer trait.

Member Author

Okay yes we can do that.
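
A rough sketch of what moving an inherent `pub fn build` into a trait implementation could look like is shown below. The trait here is a simplified assumption: the real `BuildSerializer` takes a schema, config, and build context and returns a `PyResult`, and the type strings `"tuple-variable"` and `"tuple-positional"`, the `item_count` field, and `build_for_type` are illustrative guesses rather than the project's API.

```rust
// Simplified, hypothetical version of a builder trait.
trait BuildSerializer: Sized {
    const EXPECTED_TYPE: &'static str;
    fn build(item_count: usize) -> Result<Self, String>;
}

#[derive(Debug, PartialEq)]
struct TupleVariableSerializer;

#[derive(Debug, PartialEq)]
struct TuplePositionalSerializer {
    item_count: usize,
}

// `build` lives in the trait impl instead of an inherent `pub fn build`.
impl BuildSerializer for TupleVariableSerializer {
    const EXPECTED_TYPE: &'static str = "tuple-variable"; // assumed type string
    fn build(_item_count: usize) -> Result<Self, String> {
        Ok(Self)
    }
}

impl BuildSerializer for TuplePositionalSerializer {
    const EXPECTED_TYPE: &'static str = "tuple-positional"; // assumed type string
    fn build(item_count: usize) -> Result<Self, String> {
        Ok(Self { item_count })
    }
}

// Generic code can now build any serializer through the trait.
fn build_for_type<T: BuildSerializer>(type_: &str, item_count: usize) -> Result<T, String> {
    if type_ == T::EXPECTED_TYPE {
        T::build(item_count)
    } else {
        Err(format!("expected type {}, got {}", T::EXPECTED_TYPE, type_))
    }
}
```

With both serializers behind one trait, callers no longer need each `build` to be individually `pub`; they go through the shared interface.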

@@ -160,7 +143,7 @@ pub struct TuplePositionalSerializer {
}

impl TuplePositionalSerializer {
fn build(
pub fn build(
Member

same.

}

// macro to build the match statement for validator selection
macro_rules! validator_match {
Member

again, I'd rather keep this macro.

let schema: &PyDict = schema.downcast()?;
let type_: &str = schema.get_as_req(intern!(schema.py(), "type"))?;
let val = match type_ {
// The left hand side of this match is a 1:1 match with the `type` field we use as a discriminator
Member

this comment is great, in general, please add it before the macro.

// of the match and then if the right hand side is a `{Type}Validator` you've found the implementation.
// If the right hand side is a `{Type}Builder` you'll have to look into its `build()` method to see
// what it actually returns.
// TODO: sort alphabetically? By left hand side string or right hand side type? Or RHS type's module filename?
Member

Not sure I agree about alphabetical. In theory, validators are ordered logically in core_schema.py, so the order here should ideally match that.

If we make a change, we should make it everywhere.

Member Author

Whatever we do, it would be nice to find a consistent order. But we may have to pick between the order in core_schema.py and the order of the implementations in src/validators/*.rs. Let's leave this as a note indicating that there is currently no order, and we can loop back to determining and implementing an order in the future.

Member

agreed, it should be consistent.

alphabetical is obviously simplest. The problem is that it puts closely related validators, like model and typed-dict or int and float, far apart.

@adriangb
Member Author

> the flattening looks good.
>
> I really don't agree about removing the macros, sorry.

One isn't really possible without the other. The macros assume a 1:1 mapping between the EXPECTED_TYPE constant and the type key in CoreSchema. That is no longer true (e.g. EXPECTED_TYPE = "function" and "type": "function-wrap").

Besides:

  1. Not having the macros makes it easier to understand what is going on. I think @dmontagu will back me up here.
  2. It's not even more code (I think it's less LOC).
  3. Unlike some of the other macros that get applied 3-4 times and contain a ton of complex logic, where it would be very easy for the implementations to get out of sync if they were duplicated, this logic contains minimal 1:1 duplication and is very easy to test for correctness.
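
To make the mismatch concrete, here is a minimal sketch assuming a builder whose `EXPECTED_TYPE` is `"function"` while the schema's `type` key is `"function-wrap"`, as in the example above. The trait is a stripped-down stand-in, and `matches_exactly` and `matches_flattened` are hypothetical helpers, not part of pydantic-core.

```rust
// Stripped-down, hypothetical builder trait with one constant per implementation.
trait BuildValidator {
    const EXPECTED_TYPE: &'static str;
}

struct FunctionBuilder;

impl BuildValidator for FunctionBuilder {
    const EXPECTED_TYPE: &'static str = "function";
}

// Lookup that assumes `type` == EXPECTED_TYPE, as a 1:1 macro would.
fn matches_exactly<T: BuildValidator>(type_: &str) -> bool {
    type_ == T::EXPECTED_TYPE
}

// Lookup that also accepts flattened names of the form "<EXPECTED_TYPE>-<mode>".
fn matches_flattened<T: BuildValidator>(type_: &str) -> bool {
    match type_.split_once('-') {
        Some((prefix, _mode)) => prefix == T::EXPECTED_TYPE,
        None => type_ == T::EXPECTED_TYPE,
    }
}
```

The exact comparison rejects `"function-wrap"`, which is the break described above. The prefix-based variant accepts it, but it would misread a name like `typed-dict` as prefix `typed`, so it is shown only to illustrate the problem, not as a recommended fix.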

@samuelcolvin
Member

> One isn't really possible without the other.

I think it is, you would just need a 1:1 mapping between CoreSchema args and BuildValidator, which I think makes sense anyway.

I'm strongly of the opinion that having the type strings copied in multiple places is a recipe for errors down the line.
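
One way to keep each type string in a single place is a macro that expands the match arms from the implementations' own constants. The sketch below is a simplified illustration under that assumption: the trait, the two validators, and the `String` return value are hypothetical, and the macro only resembles the project's `validator_match!` in spirit.

```rust
// Hypothetical, simplified builder trait: each validator owns its type string.
trait BuildValidator {
    const EXPECTED_TYPE: &'static str;
    fn build() -> String;
}

struct IntValidator;
struct StrValidator;

impl BuildValidator for IntValidator {
    const EXPECTED_TYPE: &'static str = "int";
    fn build() -> String {
        "IntValidator".to_string()
    }
}

impl BuildValidator for StrValidator {
    const EXPECTED_TYPE: &'static str = "str";
    fn build() -> String {
        "StrValidator".to_string()
    }
}

// Expands to one match arm per listed validator, keyed by its EXPECTED_TYPE,
// so the string literal is never repeated at the call site.
macro_rules! validator_match {
    ($type_:expr, $($validator:path),+ $(,)?) => {
        match $type_ {
            $(<$validator>::EXPECTED_TYPE => Ok(<$validator>::build()),)+
            other => Err(format!("unknown schema type: {}", other)),
        }
    };
}

fn build_validator(type_: &str) -> Result<String, String> {
    validator_match!(type_, IntValidator, StrValidator)
}
```

The trade-off discussed in this thread is visible here: the strings live only in the `impl` blocks, but a reader has to expand the macro mentally to see which `type` maps to which validator.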

@adriangb adriangb force-pushed the flatten-val-build branch 2 times, most recently from 362d62d to 5c0d11b Compare March 17, 2023 00:02
@adriangb
Member Author

adriangb commented Mar 17, 2023

please review

I added back the macros. I do see the point about duplicating strings, it makes sense. I think a macro that just expands the match statement is helpful, but maybe we can simplify the serializer macro by manually writing the enum? I think enum_dispatch may even yell at you already if you forget an item so it would be duplication yes but maybe not prone to causing errors?
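
As a sketch of the "manually written enum" idea, the example below writes the enum and its delegation by hand using only the standard library. It does not use the `enum_dispatch` crate, and whether that crate itself reports a forgotten item is not verified here. The serializer types and the `name` method are hypothetical. What it does show is that a match without a wildcard arm makes the compiler reject an unhandled variant.

```rust
// Hypothetical serializers standing in for the real ones.
struct IntSerializer;
struct StrSerializer;

trait TypeSerializer {
    fn name(&self) -> &'static str;
}

impl TypeSerializer for IntSerializer {
    fn name(&self) -> &'static str {
        "int"
    }
}

impl TypeSerializer for StrSerializer {
    fn name(&self) -> &'static str {
        "str"
    }
}

// Enum written by hand instead of being generated by a macro.
enum CombinedSerializer {
    Int(IntSerializer),
    Str(StrSerializer),
}

impl TypeSerializer for CombinedSerializer {
    fn name(&self) -> &'static str {
        // No wildcard arm: adding a variant without handling it here
        // is a compile error, so the duplication is checked by the compiler.
        match self {
            CombinedSerializer::Int(s) => s.name(),
            CombinedSerializer::Str(s) => s.name(),
        }
    }
}
```

This is duplication, as noted above, but forgetting to handle a variant fails at compile time rather than at runtime.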

I’ll add back the comments (updated) tomorrow if this hasn’t been merged yet (otherwise another PR)

@adriangb adriangb requested a review from samuelcolvin March 17, 2023 00:03
@adriangb adriangb force-pushed the flatten-val-build branch from 5c0d11b to 894e90c Compare March 17, 2023 21:30
@adriangb adriangb merged commit 5efeaf9 into main Mar 17, 2023
@adriangb adriangb deleted the flatten-val-build branch March 17, 2023 23:29
Development

Successfully merging this pull request may close these issues.

Flatten function and tuple validators/serializers into their own core schema types
3 participants