Check iterable max_length as we validate #602
Conversation
CodSpeed Performance Report: merging #602 (summary and benchmarks breakdown not reproduced here).
Codecov Report
@@ Coverage Diff @@
## main #602 +/- ##
==========================================
- Coverage 94.04% 93.55% -0.49%
==========================================
Files 98 100 +2
Lines 13177 13454 +277
Branches 25 25
==========================================
+ Hits 12392 12587 +195
- Misses 779 861 +82
Partials 6 6
... and 3 files with indirect coverage changes
I see what was going on: we originally didn't even copy the data for
Please review.
I have no issues with what I see here, though I think @samuelcolvin should also review.
For the record, somewhat unrelated to this PR, I tried both VecDeque and unsafe PyList usage (both for initialization and setting items, skipping bounds checks, etc.) and neither is really faster than safe Vec usage.
([], Err('Input should be a valid dictionary [type=dict_type,')),
([('x', 'y')], Err('Input should be a valid dictionary [type=dict_type,')),
([('x', 'y'), ('z', 'z')], Err('Input should be a valid dictionary [type=dict_type,')),
((), Err('Input should be a valid dictionary [type=dict_type,')),
((('x', 'y'),), Err('Input should be a valid dictionary [type=dict_type,')),
((type('Foobar', (), {'x': 1})()), Err('Input should be a valid dictionary [type=dict_type,')),
([], {}),
([('x', '1')], {'x': 1}),
([('x', '1'), ('z', b'2')], {'x': 1, 'z': 2}),
((), {}),
((('x', '1'),), {'x': 1}),
These are all accepted now. I also changed the value type to be int so that we know we're not swapping keys and values validators.
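For readers following along, here is a rough sketch of what those updated cases exercise. The raw core-schema dict below is my own shorthand (assumed, not copied from the test file), and the accepted inputs are specific to this PR's branch:

    from pydantic_core import SchemaValidator

    # assumed schema shape: str keys, int values
    v = SchemaValidator(
        {'type': 'dict', 'keys_schema': {'type': 'str'}, 'values_schema': {'type': 'int'}}
    )

    # on this branch, iterables of key/value pairs are accepted as dict input;
    # using int values makes it obvious if the key and value validators were swapped
    assert v.validate_python([('x', '1'), ('z', b'2')]) == {'x': 1, 'z': 2}
    assert v.validate_python((('x', '1'),)) == {'x': 1}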
tests/validators/test_frozenset.py (Outdated)
Err('Frozenset should have at most 30 items after validation, not 31 [type=too_long,'),
Err('Frozenset should have at most 3 items after validation, not >= 4 [type=too_long,'),
This is because it used to hit the default 10x max_length "input" size; now we're measuring output size, which is kind of the whole point of this PR.
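To make that concrete, here is a minimal sketch of the distinction, assuming a raw core schema and this branch's behaviour (the error type comes from the updated test message above):

    from pydantic_core import SchemaValidator, ValidationError

    v = SchemaValidator(
        {'type': 'frozenset', 'items_schema': {'type': 'int'}, 'max_length': 3}
    )

    def gen():
        # a generator has no len(), so the only way to enforce max_length is to
        # count the output as it is built
        for i in range(100):
            yield i

    try:
        v.validate_python(gen())
    except ValidationError as exc:
        # with this PR the check is against the output size, so validation stops
        # as soon as the frozenset would exceed 3 items ("not >= 4"), instead of
        # consuming up to the old 10x max_length input allowance
        assert exc.errors()[0]['type'] == 'too_long'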
([1, 2, '3'], [1, 2, 3]),
((1, 2, '3'), [1, 2, 3]),
(deque((1, 2, '3')), [1, 2, 3]),
({1, 2, '3'}, Err('Input should be a valid list [type=list_type,')),
(gen_ints(), [1, 2, 3]),
(frozenset({1, 2, '3'}), Err('Input should be a valid list [type=list_type,')),
({1: 10, 2: 20, '3': '30'}.keys(), [1, 2, 3]),
({1: 10, 2: 20, '3': '30'}.values(), [10, 20, 30]),
({1: 10, 2: 20, '3': '30'}, Err('Input should be a valid list [type=list_type,')),
((x for x in [1, 2, '3']), [1, 2, 3]),
These tests are covered above.
@pytest.mark.parametrize(
    'input_value,expected',
    [
        ([], []),
        ([1, '2', b'3'], [1, '2', b'3']),
        (frozenset([1, '2', b'3']), Err('Input should be a valid list [type=list_type,')),
        ((), []),
        ((1, '2', b'3'), [1, '2', b'3']),
        (deque([1, '2', b'3']), [1, '2', b'3']),
        ({1, '2', b'3'}, Err('Input should be a valid list [type=list_type,')),
    ],
)
@pytest.mark.parametrize('input_value,expected', [([], []), ([1, '2', b'3'], [1, '2', b'3'])])
This is covered by tests above since we no longer special case "any" item validators.
    ),
],
)
def test_allow_any_iter(input_value, expected):
The allow_any_iter parameter is now a no-op; I'll remove it in a follow-up PR.
({1, 2, '3'}, Err('Input should be a valid tuple [type=tuple_type,')),
(frozenset([1, 2, '3']), Err('Input should be a valid tuple [type=tuple_type,')),
({1, 2, '3'}, AnyOf([tuple(o) for o in sorted(itertools.permutations([1, 2, 3]))])),
(frozenset([1, 2, '3']), AnyOf([tuple(o) for o in sorted(itertools.permutations([1, 2, 3]))])),
Accounting for the fact that the resultant tuple can come out in any order.
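For anyone unfamiliar with the AnyOf helper used above: I'm not quoting its actual definition from the test suite, but something along these lines captures the intent:

    class AnyOf:
        # hypothetical stand-in for the test helper: compares equal to any one
        # of the supplied candidates, so a tuple built from a set can match
        # whichever iteration order the set happened to produce
        def __init__(self, candidates):
            self.candidates = list(candidates)

        def __eq__(self, other):
            return any(other == candidate for candidate in self.candidates)

        def __repr__(self):
            return f'AnyOf({self.candidates!r})'

    assert (2, 1, 3) == AnyOf([(1, 2, 3), (2, 1, 3), (3, 2, 1)])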
Force-pushed from 6cc77e6 to 174c3d4.
@@ -509,7 +509,7 @@ def test_multiple_tuple_recursion(multiple_tuple_schema: SchemaValidator):
     with pytest.raises(ValidationError) as exc_info:
         multiple_tuple_schema.validate_python({'f1': data, 'f2': data})

-    assert exc_info.value.errors(include_url=False) == [
+    assert exc_info.value.errors(include_url=False) == Contains(
We used to only return one of (1) the errors that occurred for each item (e.g. this recursion error) or (2) the length constraint error if the result was too short. This was inconsistent across tuples and lists. Further, having both is helpful: with just one or the other you kind of have to guess why tuple[int] was too short when you gave it ['a'], which clearly has 1 item, or vice versa.
I'm now returning both the errors for each item and the final failure of the length constraint, which @dmontagu had suggested previously.
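A small sketch of the new behaviour, with the core-schema type name assumed for this era of pydantic-core and the error details abridged:

    from pydantic_core import SchemaValidator, ValidationError

    # 'tuple-positional' is assumed to be the schema type name at the time of this PR
    v = SchemaValidator({'type': 'tuple-positional', 'items_schema': [{'type': 'int'}]})

    try:
        v.validate_python(['a'])
    except ValidationError as exc:
        types = [err['type'] for err in exc.errors()]
        # with this change, both the per-item failure and the resulting length
        # failure are reported, so you can see why the item was rejected and why
        # the output ended up too short, rather than having to guess from one error
        assert 'int_parsing' in types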
assert exc_info.value.errors(include_url=False) == Contains(
    {'type': 'missing', 'loc': (1,), 'msg': 'Field required', 'input': [1]},
    {'type': 'missing', 'loc': (2,), 'msg': 'Field required', 'input': [1]},
    {'type': 'missing', 'loc': (3,), 'msg': 'Field required', 'input': [1]},
)
As per the comment in test_definitions_recursive.py, we are now including not just the error for each location but also the final length constraint failure, so I'm using Contains to ignore the length error since that's not what matters here.
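Similarly, Contains is a partial-match test helper; this is not its real definition, but roughly:

    class Contains:
        # hypothetical stand-in: compares equal to any sequence that includes all
        # of the given items, letting the test ignore the extra length-constraint
        # error that is now also reported
        def __init__(self, *items):
            self.items = items

        def __eq__(self, other):
            return all(item in other for item in self.items)

    errors = [
        {'type': 'missing', 'loc': (1,)},
        {'type': 'too_long', 'loc': ()},
    ]
    assert errors == Contains({'type': 'missing', 'loc': (1,)})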
Please review.
I tested this with
Do you feel the performance consequences are necessary and/or negligible? It seems a lot of the benchmarks have gotten worse (looking at the CodSpeed output at the top of the PR, and assuming that's up to date), but I don't know how practically significant that is. (Happy to ignore the list[Any] thing, but now it seems there are many that have gotten noticeably worse, even without Any being referenced in the name.)
Unfortunately, I think they are inherent to "doing work on every iteration" instead of doing work just at the end, and also using … I do think we may be able to improve this down the road. I found that much of it comes down to pydantic-core/src/input/iterator.rs, lines 76 to 81 (at 9e1b84d).
This is weird because that should be very cheap to do, even if it's happening for each item. All I can guess is that it's causing differences in how the code gets compiled. Not much we can do about the set thing: adding items to a set as we go is simply more work. All the rest of the tests (e.g. the …
src/input/iterator.rs (Outdated)
#[allow(clippy::too_many_arguments)]
pub fn validate_infallible_iterator<'s, 'data, V, O, W, L>(
    py: Python<'data>,
    input: &'data impl Input<'data>,
    extra: &'s Extra<'s>,
    definitions: &'data Definitions<CombinedValidator>,
    recursion_guard: &'s mut RecursionGuard,
    checks: &mut IterableValidationChecks<'data>,
    iter: impl Iterator<Item = &'data V>,
    items_validator: &'s CombinedValidator,
    output: &mut O,
    write: &mut W,
    len: &L,
) -> ValResult<'data, ()>
where
    V: Input<'data> + 'data,
    W: FnMut(&mut O, PyObject) -> PyResult<()>,
    L: Fn(&O) -> usize,
{
    for (index, value) in iter.enumerate() {
        let result = items_validator
            .validate(py, value, extra, definitions, recursion_guard)
            .map_err(|e| e.with_outer_location(index.into()));
        if let Some(value) = checks.filter_validation_result(result, input)? {
            write(output, value)?;
            checks.check_output_length(len(output), input)?;
        }
    }
    checks.finish(input)?;
    Ok(())
}

#[allow(clippy::too_many_arguments)]
pub fn validate_fallible_iterator<'s, 'data, V, O, W, L>(
    py: Python<'data>,
    input: &'data impl Input<'data>,
    extra: &'s Extra<'s>,
    definitions: &'data Definitions<CombinedValidator>,
    recursion_guard: &'s mut RecursionGuard,
    checks: &mut IterableValidationChecks<'data>,
    iter: impl Iterator<Item = ValResult<'data, &'data V>>,
    items_validator: &'s CombinedValidator,
    output: &mut O,
    write: &mut W,
    len: &L,
) -> ValResult<'data, ()>
where
    V: Input<'data> + 'data,
    W: FnMut(&mut O, PyObject) -> PyResult<()>,
    L: Fn(&O) -> usize,
{
    for (index, result) in iter.enumerate() {
        let value = result?;
        let result = items_validator
            .validate(py, value, extra, definitions, recursion_guard)
            .map_err(|e| e.with_outer_location(index.into()));
        if let Some(value) = checks.filter_validation_result(result, input)? {
            write(output, value)?;
            checks.check_output_length(len(output), input)?;
        }
    }
    checks.finish(input)?;
    Ok(())
}
Splitting these up (same for the dict.rs versions) was an attempt to improve performance for the non-fallible case. It seemed to make a tiny difference, but to be honest it might be worth just collapsing them into a single function to save a couple hundred LOC.
I did some more testing on those LOC referenced. Boxing …
Okay, I think the breadth and size of the performance degradation here is significant. I'm also somewhat nervous about making a change this big at this stage; the current code has been fairly well tested over the last few months. If we release this in the beta, we lose all the value of the usage testing of the current code. Let's defer this for now, and I'll try to have a look at applying some of the behaviour changes included here to the current code base with a smaller change.
I have an alternative take on this in #610, which has a smaller footprint and, I think, a smaller impact on performance. But having looked at this a bit more, my opinion is that the current behaviour of checking length after finishing validating elements is correct. I have the following rationale:
It seems like @adriangb and I disagree; let's see what @dmontagu thinks. I'm happy to defer if you both agree, but I think #610 might be a safer way to go, even though it will require a little more work.
Just to be clear, #610 is not handling our current V1 breaking changes w.r.t. allowed inputs, right? That's a good chunk of this PR. If we remove the set changes here (comment on that below) and fix the performance degradation for the length checks (which, again, is not inherent to the change but rather related to whatever the compiler is doing), we'd end up with just that part, which is something we should do regardless.

For the sets thing, there are more factors than performance. Memory is one of them. If you accumulate in a vec, you end up keeping multiple copies of each item in memory, which could be arbitrarily large. I'd argue that's unexpected as a user, since it may very well be the reason they're using sets.
I think allowing sets as inputs to list and tuple is easy. Building a set as we go along involves a lot of copying, maybe even a macro, but isn't conceptually hard. I'm holding off on those changes until we agree whether the change to length checking is necessary.
Okay, I think in an ideal world it would be a config flag, so you could decide between stopping early (with possible performance and arguably security benefits) and displaying all errors. I would suggest the default behavior be the more development-time-friendly (IMO) show-all-errors. (I consider this more development-time-friendly because it has the chance to reveal additional errors in a single call.) But I see the benefit for production-mode web apps being able to say they want early-stopping behavior.

I don't think it would be crazy to change to always stopping early, but when in doubt, I guess I'd prefer to preserve V1 behavior? (Not sure what it is, though I suspect it's validate-everything.) Something I would find a lot more convincing about what the behavior should be is any (GitHub) issues requesting one behavior vs. the other. If there are no requests or stated concerns about this, I think I'd err on the side of what's easier to implement and/or how it works in V1.

Regarding the vec vs. one-at-a-time set building, I see the arguments on both sides. I think from a theoretical perspective the worst-case scenario is significantly worse on memory using a vec than on performance using a one-at-a-time Python set. However, from a practical perspective, it's harder for me to say; the set of scenarios where memory consumption would be the constraint seems small, but wall-clock performance improvements are likely to benefit everyone using it to some (admittedly, likely small) degree. I don't think using …

One possible option, which I suspect neither of you wants to take, would be a hybrid approach that uses a pure vec up to a certain size, then converts to a set beyond that. (There are 1000 heuristic approaches you could take to optimize beyond that, but I sort of think anything beyond this is either going to be obvious to you both or too complex to be desirable. Probably the latter.)
I'll add that early-stopping behavior would be a significant improvement in developer experience for people who want to use this as a way to limit request sizes, as it lets you specify what you care about much more directly than a blanket request-size cap, etc. So I see this as a pretty good argument in favor of the (eventual) config setting proposed above if we don't change this behavior now. But I don't know what fraction of people building pydantic-backed webapps find themselves caring deeply about granular request-size capping like this.

I guess one other important question is: under what scenarios are these limits typically used? If the typical usage is to prevent large payloads (perhaps misguided since, as we've noted, it won't stop full validation), then that's an argument for early stopping. If the typical usage is for business-logic reasons, I think that would be more ambiguous, but leaning toward show-all-errors. Might be worth a skim of Sourcegraph usage of these constraints.
V1 stops early:

    from pydantic import BaseModel, ValidationError, Field, validator

    calls: list[int] = []


    class Inner(BaseModel):
        x: int

        @validator('x')
        def val_x(cls, v: int) -> int:
            calls.append(v)
            assert v < 10
            return v


    class Outer(BaseModel):
        inner: list[Inner] = Field(max_items=10)


    try:
        Outer(inner=[{'x': n} for n in range(100)])
    except ValidationError:
        pass
    else:
        assert False

    assert len(calls) <= 10

The same code in V2 fails on the final assertion.
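For comparison, a rough V2 counterpart of the snippet above; the field_validator / max_length spelling is the released v2 API and may not match the beta under discussion exactly:

    from pydantic import BaseModel, Field, ValidationError, field_validator

    calls: list[int] = []


    class Inner(BaseModel):
        x: int

        @field_validator('x')
        @classmethod
        def val_x(cls, v: int) -> int:
            calls.append(v)
            assert v < 10
            return v


    class Outer(BaseModel):
        inner: list[Inner] = Field(max_length=10)


    try:
        Outer(inner=[{'x': n} for n in range(100)])
    except ValidationError:
        pass

    # per the discussion above, V2 at this point validated every item before
    # reporting the length failure, so the V1-style `assert len(calls) <= 10`
    # fails here; later changes (e.g. #610) may alter this
    print(len(calls))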
I don't recall any off the top of my head, but maybe that's because the V1 behavior is the right behavior? For all we know, if we change the V1 behavior we'll be flooded with issues and security reports.
I agree, but keep in mind that we're comparing a 10% constant-factor worst-case performance degradation against unbounded worst-case memory degradation. And at some point, allocating the entire Vec up front (what we currently do), or conversely bump-allocating it as we go, could lead to worse performance, or at the very least unpredictable performance that depends on the input.
Okay, I think those issues are now fixed on #610 - please review and let me know what you think. I think the smaller footprint and fewer performance regressions make it the better option.
Yeah, if V1 did the early stopping, that's a pretty strong argument in favor of it, IMO.
This is now +200 LOC vs. main, so it's really not that much more code, but still a large diff (1104 insertions(+), 844 deletions(-)). A lot of the diff size is coming from the added benchmarks and tests.
This is set up so that it will work for all iterables, but I'm only implementing it for lists in this PR to keep things self-contained. That also means the diff is much bigger than it needs to be since I'm not deleting the code this is replacing (I estimate it's about a 1:1 LOC swap).
Selected Reviewer: @dmontagu