Flatten schemas and replace macros with plain code #450

adriangb · 2023-03-16T05:47:41Z

Closes #444

codspeed-hq · 2023-03-16T06:22:44Z

CodSpeed Performance Report

Merging #450 flatten-val-build (894e90c) will not alter performances.

Summary

🔥 0 improvements
❌ 1 regressions
✅ 92 untouched benchmarks

🆕 0 new benchmarks
⁉️ 0 dropped benchmarks

Benchmarks breakdown

	Benchmark	`main`	`flatten-val-build`	Change
❌	`test_build_schema`	3.6 ms	4.1 ms	-13.66%

codecov-commenter · 2023-03-16T06:22:59Z

Codecov Report

Merging #450 (894e90c) into main (653308f) will decrease coverage by 0.46%.
The diff coverage is 87.36%.


Additional details and impacted files

@@            Coverage Diff             @@
##             main     #450      +/-   ##
==========================================
- Coverage   95.36%   94.90%   -0.46%     
==========================================
  Files          93       93              
  Lines       11115    11157      +42     
  Branches       22       22              
==========================================
- Hits        10600    10589      -11     
- Misses        510      563      +53     
  Partials        5        5

Impacted Files	Coverage Δ
src/serializers/type_serializers/other.rs	`84.90% <ø> (-2.79%)`	⬇️
src/serializers/type_serializers/tuple.rs	`93.07% <ø> (-0.12%)`	⬇️
src/validators/tuple.rs	`98.85% <ø> (+0.52%)`	⬆️
src/validators/float.rs	`88.59% <40.00%> (-10.39%)`	⬇️
pydantic_core/core_schema.py	`97.01% <100.00%> (-0.01%)`	⬇️
src/serializers/shared.rs	`90.19% <100.00%> (-0.41%)`	⬇️
src/serializers/type_serializers/function.rs	`93.57% <100.00%> (+0.94%)`	⬆️
src/validators/custom_error.rs	`100.00% <100.00%> (ø)`
src/validators/function.rs	`99.02% <100.00%> (-0.04%)`	⬇️
src/validators/mod.rs	`98.75% <100.00%> (+0.01%)`	⬆️
... and 1 more

... and 1 file with indirect coverage changes

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

adriangb · 2023-03-16T15:13:30Z

src/validators/mod.rs

+        // The left hand side of this match is a 1:1 match with the `type` field we use as a discriminator
+        // in our CoreSchema union
+        // The right hand side is _generally_ a 1:1 match but there are cases where we use a `Builder`
+        // on the right that may have internal logic to return different validators or just build a
+        // more complex validator (e.g. building a union of isinstance validators or something).
+        // So to get from Python -> Rust implementation you should trace the `type` on the left hand side
+        // of the match and then if the right hand side is a `{Type}Validator` you've found the implementation.
+        // If the right hand side is a `{Type}Builder` you'll have to look into its `build()` method to see
+        // what it actually returns.
+        // TODO: sort alphabetically? By left hand side string or right hand side type? Or RHS type's module filename?


@dmontagu I added this comment as per our discussion earlier today.
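As a rough illustration of the dispatch that comment describes, here is a minimal sketch of a match on the flattened `type` string. The `Validator` enum and its variants are hypothetical stand-ins, not the real validator types, and only `"function-wrap"`, `"int"` and `"float"` are names taken from this thread.

```rust
// Hypothetical stand-in for the real validator types; illustrative only.
#[derive(Debug, PartialEq)]
enum Validator {
    Int,
    Float,
    FunctionWrap,
}

/// Map the flattened `type` discriminator to a validator.
fn build_validator(type_: &str) -> Result<Validator, String> {
    match type_ {
        // Left: the `type` string from the schema. Right: what gets built.
        "int" => Ok(Validator::Int),
        "float" => Ok(Validator::Float),
        "function-wrap" => Ok(Validator::FunctionWrap),
        // Anything else is reported as an error rather than panicking.
        other => Err(format!("unknown schema type: {}", other)),
    }
}
```

With a flat match like this, tracing a schema `type` to its implementation is a single lookup on the left hand side.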

samuelcolvin

the flattening looks good.

I really don't agree about removing the macros, sorry.

samuelcolvin · 2023-03-16T20:13:11Z

Cargo.toml

@@ -1,4 +1,5 @@
 [package]
+rust-version = "1.59"


Hmm, I'm not sure about this. Is there a specific reason to pin this?

Sorry, again this was accidental. While trying to debug that segfault I updated to 1.70, which seems to be totally broken.

samuelcolvin · 2023-03-16T20:16:19Z

rust-toolchain

@@ -1 +1 @@
-nightly
+nightly-2023-03-01


I'd rather not pin this.

Probably better to remove this file completely.

Sorry this was accidental and unrelated to this PR.

the file is now gone which should make things easier.

samuelcolvin · 2023-03-16T20:17:21Z

src/serializers/shared.rs

-            $($e_key($e_serializer),)*
-            $($b_key($b_serializer),)*
-        }
+#[derive(Debug, Clone)]


sorry 😄

I don't agree about removing this macro, it's working, it removes duplication, happy to add a better docstring.

I'd really rather we didn't remove it.

samuelcolvin · 2023-03-16T20:18:20Z

src/serializers/type_serializers/other.rs

@@ -40,7 +40,8 @@ impl BuildSerializer for FunctionBuilder {
        build_context: &mut BuildContext<CombinedSerializer>,
    ) -> PyResult<CombinedSerializer> {
        let py = schema.py();
-        let mode: &str = schema.get_as_req(intern!(py, "mode"))?;
+        let type_: &str = schema.get_as_req(intern!(py, "type"))?;
+        let mode = type_.split_once('-').unwrap().1;
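The hunk above derives `mode` from the flattened `type` string, and the `unwrap()` would panic if the string contained no `-`. That may be fine if the schema is validated before this point. For comparison, a minimal sketch of a non-panicking variant follows; `mode_from_type` is a hypothetical helper name, not something from the PR.

```rust
/// Return the part after the first '-' in a flattened type string,
/// or `None` when the string contains no '-'.
/// (Hypothetical helper; the PR uses `split_once('-').unwrap().1` inline.)
fn mode_from_type(type_: &str) -> Option<&str> {
    type_.split_once('-').map(|(_, mode)| mode)
}
```

A caller could then turn `None` into a schema error instead of a panic.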


samuelcolvin · 2023-03-16T20:18:57Z

src/serializers/type_serializers/tuple.rs

@@ -39,7 +22,7 @@ pub struct TupleVariableSerializer {
 }

 impl TupleVariableSerializer {
-    fn build(
+    pub fn build(


this should become an implementation of the BuildSerializer trait.

Okay yes we can do that.
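A simplified sketch of what moving `build` onto the trait might look like. Everything here is a stand-in: the real `BuildSerializer::build` takes pyo3 types and a `BuildContext` and returns a `PyResult<CombinedSerializer>` (as the `other.rs` hunk above shows), and the `"tuple-variable"` string and the `EXPECTED_TYPE` constant on this trait are assumptions for illustration.

```rust
// Hypothetical placeholder for the schema dict passed to `build`.
struct Schema;

// Hypothetical, reduced version of the combined serializer enum.
#[derive(Debug, PartialEq)]
enum CombinedSerializer {
    TupleVariable,
}

// Simplified stand-in for the trait; the real signature differs.
trait BuildSerializer {
    const EXPECTED_TYPE: &'static str;
    fn build(schema: &Schema) -> Result<CombinedSerializer, String>;
}

struct TupleVariableSerializer;

// `build` lives on the trait impl instead of being a `pub fn` on the struct.
impl BuildSerializer for TupleVariableSerializer {
    const EXPECTED_TYPE: &'static str = "tuple-variable"; // assumed type string
    fn build(_schema: &Schema) -> Result<CombinedSerializer, String> {
        Ok(CombinedSerializer::TupleVariable)
    }
}
```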

samuelcolvin · 2023-03-16T20:19:18Z

src/serializers/type_serializers/tuple.rs

@@ -160,7 +143,7 @@ pub struct TuplePositionalSerializer {
 }

 impl TuplePositionalSerializer {
-    fn build(
+    pub fn build(


samuelcolvin · 2023-03-16T20:20:49Z

src/validators/mod.rs

-}
-
-// macro to build the match statement for validator selection
-macro_rules! validator_match {


again, I'd rather keep this macro.

samuelcolvin · 2023-03-16T20:21:57Z

src/validators/mod.rs

+    let schema: &PyDict = schema.downcast()?;
+    let type_: &str = schema.get_as_req(intern!(schema.py(), "type"))?;
+    let val = match type_ {
+        // The left hand side of this match is a 1:1 match with the `type` field we use as a discriminator


This comment is great in general; please add it before the macro.

samuelcolvin · 2023-03-16T20:22:44Z

src/validators/mod.rs

+        // of the match and then if the right hand side is a `{Type}Validator` you've found the implementation.
+        // If the right hand side is a `{Type}Builder` you'll have to look into its `build()` method to see
+        // what it actually returns.
+        // TODO: sort alphabetically? By left hand side string or right hand side type? Or RHS type's module filename?


Not sure I agree about alphabetical. In theory validators are ordered logically in core_schema.py, so the order here should ideally match that.

If we make a change, we should make it everywhere.

Whatever we do, it would be nice to find a consistent order. But we may have to pick between the order in core_schema.py and the order of the implementations in src/validators/*.rs. Let's leave this as a note indicating that there is no order currently, and we can loop back to determining and implementing an order in the future.

agreed, it should be consistent.

Alphabetical is obviously simplest. The problem is that it puts closely related validators, like model and typed-dict or int and float, far apart.


adriangb · 2023-03-16T20:29:38Z

> the flattening looks good.
>
> I really don't agree about removing the macros, sorry.

One isn't really possible without the other. The macros assume a 1:1 mapping between the EXPECTED_TYPE constant and the type key in CoreSchema. That is no longer true (e.g. EXPECTED_TYPE = "function" and "type": "function-wrap").

Besides:

- Not having the macros makes it easier to understand what is going on. I think @dmontagu will back me up here.
- It's not even more code (I think it's less LOC).
- Unlike some of the other macros that get applied 3-4 times and contain a ton of complex logic, where it would be very easy for the implementations to get out of sync if they were duplicated, this logic contains minimal 1:1 duplication and is very easy to test for correctness.

samuelcolvin · 2023-03-16T22:27:38Z

> One isn't really possible without the other.

I think it is, you would just need a 1:1 mapping between CoreSchema args and BuildValidator, which I think makes sense anyway.

I'm strongly of the opinion that having the type strings copied in multiple places is a recipe for errors down the line.
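To make the single-source-of-truth argument concrete, here is a minimal sketch of a macro that expands to the match while reading each type string from a constant on the implementing type. `EXPECTED_TYPE`, `BuildValidator` and `validator_match` are names mentioned in this thread, but the trait shape, the structs and the macro body below are assumptions for illustration, not the project's actual code.

```rust
// Hypothetical, reduced trait: each builder declares the `type` string it handles.
trait BuildValidator {
    const EXPECTED_TYPE: &'static str;
    // Stand-in for building a real validator; returns a name for demonstration.
    fn build() -> String;
}

struct IntValidator;
struct FunctionWrapValidator;

impl BuildValidator for IntValidator {
    const EXPECTED_TYPE: &'static str = "int";
    fn build() -> String {
        "IntValidator".to_string()
    }
}

impl BuildValidator for FunctionWrapValidator {
    const EXPECTED_TYPE: &'static str = "function-wrap";
    fn build() -> String {
        "FunctionWrapValidator".to_string()
    }
}

// Expands to one guarded arm per listed type, comparing against its
// EXPECTED_TYPE, so the type strings live only on the impls.
macro_rules! validator_match {
    ($type_:expr, $($validator:ty,)+) => {
        match $type_ {
            $(
                t if t == <$validator as BuildValidator>::EXPECTED_TYPE => {
                    Ok(<$validator as BuildValidator>::build())
                }
            )+
            other => Err(format!("unknown schema type: {}", other)),
        }
    };
}

fn build_validator(type_: &str) -> Result<String, String> {
    validator_match!(type_, IntValidator, FunctionWrapValidator,)
}
```

The trade-off discussed in this thread is visible here: the strings are not duplicated, but a reader has to expand the macro mentally to see which `type` maps to which implementation.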

adriangb · 2023-03-17T00:03:50Z

please review

I added back the macros. I do see the point about duplicating strings; it makes sense. I think a macro that just expands the match statement is helpful, but maybe we can simplify the serializer macro by manually writing the enum? I think enum_dispatch may already yell at you if you forget an item, so it would be duplication, yes, but maybe not prone to causing errors?

I’ll add back the comments (updated) tomorrow if this hasn’t been merged yet (otherwise another PR)

adriangb force-pushed the flatten-val-build branch from c34243a to 7dc5087 Compare March 16, 2023 06:05

adriangb requested a review from dmontagu March 16, 2023 06:56

adriangb marked this pull request as ready for review March 16, 2023 06:56

adriangb force-pushed the flatten-val-build branch from 1b2c0c5 to 20837c2 Compare March 16, 2023 15:12

adriangb commented Mar 16, 2023


samuelcolvin reviewed Mar 16, 2023


adriangb force-pushed the flatten-val-build branch 2 times, most recently from 362d62d to 5c0d11b Compare March 17, 2023 00:02

adriangb requested a review from samuelcolvin March 17, 2023 00:03

Flatten CoreSchema types to get a single discriminant key

894e90c

adriangb force-pushed the flatten-val-build branch from 5c0d11b to 894e90c Compare March 17, 2023 21:30

adriangb merged commit 5efeaf9 into main Mar 17, 2023

adriangb deleted the flatten-val-build branch March 17, 2023 23:29

samuelcolvin pushed a commit that referenced this pull request Mar 20, 2023

Flatten CoreSchema types to get a single discriminant key (#450)

52c41ff
