Skip to content

Fix race condition in validators #5762

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

michael-k
Copy link
Contributor

This fixes #5760 and also includes the failing test from #5761.

It implements the last mentioned possible fix mentioned in the issue's initial comment.

@@ -532,6 +534,7 @@ def run_validators(self, value):
errors = []
for validator in self.validators:
if hasattr(validator, 'set_context'):
validator = copy.deepcopy(validator)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If set_context means that we need a separate instance, perhaps we should be reconsidering the interface (e.g. set_context returns a new instance).

deepcopy is black magic, we should avoid it if we can.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first thought was to pass the instance to the validator when calling it, and just pass it to all functions instead of saving it on the instance (ie. as self.instance). But not all validators accept a context and using inspect to decide if the validator accepts a context seemed like a bad idea. Last, but not least, this would break all existing custom validators with set_context.

Changing the interface of set_context is also a breaking change for all such custom validators.

So maybe this is something to consider for a 4.0 release, but not for 3.8? I'd love to see a fix for this in the 3.8 release. (I've patched the one place in our code that threw the most exceptions, but not all Validators in all our microservices.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that deepcopy is a good way to fix the issue for now (I do not have an opinion about whether deepcopy is good, bad or magic). Changing the interface or the meaning of set_context is potentially a breaking change, though it would be for a very good reason.

Seeing what this bug is causing (potentially inconsistencies in the database), I believe that a quick fix should be applied.

For a long term solution, it should be considered to NOT store self.instance in the validators set_context method, but to pass it to the validator(...) method. Maybe it is possible to do both for some time and log a notification with a deprecation warning.

Copy link
Contributor Author

@michael-k michael-k Feb 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass it to the validator(...) method

Do you have any good solution how to decide whether to pass it or not, ie. a replacement for hasattr(validator, 'set_context')? It's possible to use any django validator (from the core, from drf, third party or custom) here and most of them do not accept the instance.

I could only think of using inspect (+ py2.7) or try-except.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, that means all those validators would have to be changed, I guess that's not feasible at all (and, afterall, it is definately not going to happen).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible solution:

  • Refactor unique validators to inherit a base InstanceValidator
  • Change the InstanceValidator call signature to expect an instance in addition to the value.
  • Delete the set_context method handling.
  • Handle regular validators first, then instance validators second.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the input. I like it and will look into it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've finally gotten around to tackle this. I've opened a new PR: #6172

Copy link
Collaborator

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @michael-k.

From the issue:

validator is the same instance for every instance of the serializer: DocumentSerializer(instance=d).validators[0] is DocumentSerializer(instance=d2).validators[0]

So why don't we just deepcopy the validators in BaseSerializer.__init__?

What's with the CreateOnlyDefault and get_default changes? (I'm not quite seeing how they come up...)

@michael-k
Copy link
Contributor Author

So why don't we just deepcopy the validators in BaseSerializer.__init__?

You can pass validators to Field.__init__ (which BaseSerializer is derived of) or you can define validators via the Meta class. And cloning in __init__ would mean some overhead when using the serializer for outgoing data without any validation (ie. Serializer(instance).data). Therefore the __init__ method seems like the wrong place

I think the setter needs to be adjusted as well as get_validators (or the property).

  • @validators.setter
    def validators(self, validators):
    self._validators = validators
  • BaseSerializer
    def get_validators(self):
    """
    Returns a list of validator callables.
    """
    # Used by the lazily-evaluated `validators` property.
    meta = getattr(self, 'Meta', None)
    validators = getattr(meta, 'validators', None)
    return validators[:] if validators else []
  • ModelSerializer
    def get_validators(self):
    """
    Determine the set of validators to use when instantiating serializer.
    """
    # If the validators have been declared explicitly then use that.
    validators = getattr(getattr(self, 'Meta', None), 'validators', None)
    if validators is not None:
    return validators[:]
    # Otherwise use the default set of validators.
    return (
    self.get_unique_together_validators() +
    self.get_unique_for_date_validators()
    )

Cloning in the property would lead to some overhead if the ModelSerializer uses the default set of validators, because they are unique per serializer instance. (But my current implementation does not account for this either.)

# .validators is a lazily loaded property, that gets its default
# value from `get_validators`.
@property
def validators(self):
if not hasattr(self, '_validators'):
self._validators = self.get_validators()
return self._validators

What's with the CreateOnlyDefault and get_default changes? (I'm not quite seeing how they come up...)

I patched all places that call set_context, because I did not see the Fields being deepcopied. But as @xordoquy pointed out in the issue, this might not be necessary:

Most of the time, this will be handled by the fact that field's are deepcopied, except that indeed, root's serializers don't got through it.

@michael-k
Copy link
Contributor Author

Closed in favor of #6172

@rpkilby rpkilby closed this Sep 10, 2018
@michael-k michael-k deleted the race-condition branch June 24, 2020 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Race condition in validators
5 participants