$combine: Re-use with "additionalProperties": false #119

Closed
@handrews

[Filing at the request of @epoberezkin in #15]

Background

This is a proposed solution for one of the ways people want to re-use schemas that declare "additionalProperties": false. It could be made part of the JSON Schema specification, either as a required or optional feature, or it could simply be published to illustrate how a completely separate tool could be built to work around the limitation. This separability hinges on the proposal being in the form of a preprocessing step.

The reason to have the separability is that there is no other way to address this use case without violating at least one desirable property of JSON Schema validation, or introducing arbitrary JSON transformation capabilities into a standard that should be focused on schemas.

Allowing for partial implementation may also be a good idea, as any full solution will be complex to implement. A useful but limited solution is quite easy.

Keywords $combine and $combinable

There are two keywords in this proposal, both marked with a $ to indicate that they (like $ref) are structural elements outside of validation proper, and can be preprocessed out (either entirely, or lazily as needed in the case of $ref cycles).

$combine implements the oft-requested re-use pattern of applying additionalProperties only after inventorying all properties and patternProperties in all constituent schemas.

$combinable is a boolean keyword. If false, then $combine has no special effect on the schema: it behaves like allOf (explained in detail below). This preserves the use case of using additionalProperties to intentionally close a schema off to this sort of "consider all properties" extension.

If $combinable defaults to true, then current schemas must opt-out of being usable with $combine. If it defaults to false, then current schemas must opt-in. Opting in would be compatible with the current draft, so I recommend that $combinable default to false to avoid automatically circumventing any intent to close a schema to re-use.

Formal description of the limitation of additionalProperties

Note: This assumes that #101 (allow true and false for all child schemas) is implemented, as it makes this whole thing about ten times clearer.

For nearly all validation keywords k1, k2, k3... the following (which @awwright referred to as "linearity" in issue #65) holds true:

{
    "k1": "something",
    "k2": {},
    "k3": ["etc."]
}

is equivalent to

{
    "allOf": [
        {"k1": "something"},
        {"k2": {}},
        {"k3": ["etc."]}
    ]
}

There are only a few exceptions (see also issue #77), and they tend to be vexing. additionalProperties is the one that trips people up the most. Here are its properties in terms of "schema algebra" (for lack of a better term):

{
    "required": ["bar", "x-awesomeness"],
    "minProperties": 5,
    "maxProperties": 10,
    "properties": {
       "foo": {"type": "number"},
       "bar": {"type": "boolean"}
    },  
    "patternProperties": {
        "^x-": {"type": "string"},
        "maybe$": {"type": ["boolean", "null"]}
    },  
    "additionalProperties": {
        "type": "object"
    }   
}

is equivalent to

{
    "allOf": [
        {"required": ["bar"]},
        {"required": ["x-awesomeness"]},
        {"minProperties": 5},
        {"maxProperties": 10},
        {"properties": {"foo": {"type": "number"}}},
        {"properties": {"bar": {"type": "boolean"}}},
        {"patternProperties": {"^x-": {"type": "string"}}},
        {"patternProperties": {"maybe$": {"type": ["boolean", "null"]}}},
        {   
            "additionalProperties": {"type": "object"},
            "properties": {"foo": true, "bar": true},
            "patternProperties": {"^x-": true, "maybe$": true} 
        }   
    ]   
}

You can separate each entry in properties and patternProperties, and handle the child validation in each of those separate entries, but the only simplification that can be done to a factored-out additionalProperties is to reduce the properties and patternProperties down to just validating the instance object property names (by giving blank or true child schemas) and leaving the child validation to the separated schemas.

This is because additionalProperties determines how to behave by taking the set of instance properties, removing all properties that match either properties or patternProperties, and applying its child schema to the value of any properties that remain. So the child schemas in properties and patternProperties have no effect on additionalProperties and can be factored out, but the property names and patterns do.
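As a rough illustration of that rule, here is a minimal Python sketch. The helper name additional_property_names is hypothetical, and Python's re.search is assumed to be close enough to the schema regex dialect for these patterns. It shows that replacing every child schema with true leaves the set of "additional" properties unchanged:

```python
import re

def additional_property_names(schema, instance):
    """Return the instance property names that additionalProperties applies to.

    Hypothetical helper: only the names in "properties" and the patterns in
    "patternProperties" are consulted; their child schemas are never read.
    """
    named = set(schema.get("properties", {}))
    patterns = [re.compile(p) for p in schema.get("patternProperties", {})]
    return sorted(
        name for name in instance
        if name not in named and not any(p.search(name) for p in patterns)
    )

full = {
    "properties": {"foo": {"type": "number"}, "bar": {"type": "boolean"}},
    "patternProperties": {"^x-": {"type": "string"}, "maybe$": {"type": ["boolean", "null"]}},
    "additionalProperties": {"type": "object"},
}
# Same names and patterns, but every child schema reduced to true.
names_only = {
    "properties": {"foo": True, "bar": True},
    "patternProperties": {"^x-": True, "maybe$": True},
    "additionalProperties": {"type": "object"},
}
instance = {"foo": 1, "x-awesomeness": "high", "is-it-maybe": None, "extra": {}, "other": {}}

print(additional_property_names(full, instance))        # ['extra', 'other']
print(additional_property_names(names_only, instance))  # ['extra', 'other']
```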

This is fundamentally why re-using schemas that use additionalProperties is so problematic for many people. It comes up most when it's set to false, but the underlying problem affects any use of a non-default additionalProperties.

Note that required, minProperties, and maxProperties do not have to be carried into the factored-out additionalProperties schema, as they do not impact the behavior of additionalProperties. Issue #65 proposes also removing any property named in required from the set of properties affected by additionalProperties, which would mean that required would also need to be carried along in the factored-out schema. This is why I oppose #65.

Re-use and additionalProperties

In the thread about use cases for JSON Schema re-use, we agreed that there have been several use cases motivating various attempts to get around this algebraic limitation. A simple example of what's desired is that people want this:

{
    "type": "object",
    "allOf": [
        {"properties": {"foo": {"type": "boolean"}}, "additionalProperties": false},
        {"properties": {"bar": {"type": "number"}}, "required": ["bar"]}
    ]
}

to be equivalent to

{
    "type": "object",
    "properties": {"foo": {"type": "boolean"}, "bar": {"type": "number"}},
    "additionalProperties": false,
    "required": ["bar"]
}

Instead, that allOf combination can never validate: the foo schema forbids any property other than "foo", while the bar schema requires the property "bar".

Note that an analogous problem occurs if additionalProperties is set to any non-blank schema. For instance, if it was set to {"type": "string"} it would still be an impossible schema as that would be applied to "bar" which already required its value to be a number.
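To make the mismatch concrete, here is a deliberately tiny validator sketch in Python. The function is_valid is hypothetical and only covers the keywords used in this example (type with a single string value, properties, required, additionalProperties, allOf, and boolean schemas); it is not a real JSON Schema implementation.

```python
def is_valid(schema, instance):
    """Validate instance against a very small subset of JSON Schema."""
    if schema is True:
        return True
    if schema is False:
        return False
    python_types = {"object": dict, "boolean": bool, "number": (int, float), "string": str}
    if "type" in schema:
        if not isinstance(instance, python_types[schema["type"]]):
            return False
        # bool is a subclass of int in Python, but a JSON boolean is not a number.
        if schema["type"] == "number" and isinstance(instance, bool):
            return False
    if not all(is_valid(sub, instance) for sub in schema.get("allOf", [])):
        return False
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False
        props = schema.get("properties", {})
        for name, value in instance.items():
            if name in props:
                if not is_valid(props[name], value):
                    return False
            elif "additionalProperties" in schema:
                # Only names local to *this* schema object are exempt.
                if not is_valid(schema["additionalProperties"], value):
                    return False
    return True

with_all_of = {
    "type": "object",
    "allOf": [
        {"properties": {"foo": {"type": "boolean"}}, "additionalProperties": False},
        {"properties": {"bar": {"type": "number"}}, "required": ["bar"]},
    ],
}
desired = {
    "type": "object",
    "properties": {"foo": {"type": "boolean"}, "bar": {"type": "number"}},
    "additionalProperties": False,
    "required": ["bar"],
}
instance = {"foo": True, "bar": 1}

print(is_valid(with_all_of, instance))  # False: the first branch rejects "bar"
print(is_valid(desired, instance))      # True
```

Dropping "bar" from the instance does not help either, since the second branch then fails on required.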

patternProperties behaves in an analogous way, so I mostly will ignore it to keep things (relatively) simple.

Solving the problem with a pre-processing step

I propose to solve this with a keyword, $combine, indicating a pre-processing step:

  • It should start with a $ to indicate that, like $ref, there is something structural going on
  • The keyword simply applies the sort of factoring-out of additionalProperties shown at the beginning of this proposal
  • As a pre-processing step, it can easily be published separately, outside of the JSON Schema spec proper
  • It preserves the context-free validation property by ensuring that all schemas can be programmatically converted to schemas that validate in a context-free manner

Example conversion

Here is the conversion, replacing allOf with $combine. After preprocessing, this:

{
    "type": "object",
    "$combine": [
        {"properties": {"foo": {"type": "boolean"}}, "additionalProperties": false},
        {"properties": {"bar": {"type": "number"}}, "required": ["bar"]}
    ]
}

becomes this:

{
    "type": "object",
    "allOf": [
        {"properties": {"foo": {"type": "boolean"}}},
        {"properties": {"bar": {"type": "number"}}, "required": ["bar"]},
        {"properties": {"foo": true, "bar": true}, "additionalProperties": false}
    ]
}

The only change to the first two entries is removing additionalProperties. The new third entry just collects the various properties (and patternProperties if they were present) with blank schemas, and includes them with the factored-out "additionalProperties": false.

The same thing would happen when considering our alternate example with "additionalProperties": {"type": "string"}. The first two would look exactly as they do above, and the third would look the same except for {"type": "string"} in place of false.
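A minimal Python sketch of this simple-case conversion might look like the following. The function names combine_simple and preprocess are hypothetical, and the sketch assumes things the proposal does not settle: at most one constituent declares additionalProperties, $combinable is ignored, and no constituent contains boolean keywords.

```python
def combine_simple(constituents):
    """Turn a list of $combine constituents into a list of allOf entries."""
    names, patterns, additional = {}, {}, []
    stripped = []
    for schema in constituents:
        schema = dict(schema)  # shallow copy so the input is left untouched
        names.update({name: True for name in schema.get("properties", {})})
        patterns.update({p: True for p in schema.get("patternProperties", {})})
        if "additionalProperties" in schema:
            additional.append(schema.pop("additionalProperties"))
        stripped.append(schema)
    if not additional:
        return stripped  # nothing to factor out; plain allOf semantics
    if len(additional) > 1:
        raise ValueError("sketch handles at most one additionalProperties")
    # The extra entry: every name and pattern with a true child schema,
    # plus the factored-out additionalProperties.
    closing = {}
    if names:
        closing["properties"] = names
    if patterns:
        closing["patternProperties"] = patterns
    closing["additionalProperties"] = additional[0]
    return stripped + [closing]

def preprocess(schema):
    """Replace a top-level $combine with the equivalent allOf."""
    result = {k: v for k, v in schema.items() if k != "$combine"}
    if "$combine" in schema:
        result["allOf"] = result.get("allOf", []) + combine_simple(schema["$combine"])
    return result

source = {
    "type": "object",
    "$combine": [
        {"properties": {"foo": {"type": "boolean"}}, "additionalProperties": False},
        {"properties": {"bar": {"type": "number"}}, "required": ["bar"]},
    ],
}
converted = preprocess(source)
print(converted["allOf"][2])
# {'properties': {'foo': True, 'bar': True}, 'additionalProperties': False}
```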

Difficulty and optional-ness

Making these conversions is easy in the simple case. Even when there are child schemas, it's pretty easy to implement:

{
    "$combine": [
        {"properties": {
            "foo": {"properties": {"bar": X}}
        }},
        {"properties": {
            "foo": {"properties": {"bar": Y}}
        }}
    ]
}

is equivalent to:

{
    "properties": {
        "foo": {"properties": {"bar": {"$combine": [X, Y]}}}
    }
}

$combine gets challenging when combining complex schemas that themselves involve boolean keywords. Implementing $combine in such a situation requires implementing a kind of "schema algebra". Requiring this of all implementations seems excessively burdensome. As with format, I'd recommend specifying this but leaving it up to implementations whether to support all or part of the feature. In particular, an implementation may want to implement $combine except not support combining schemas that contain boolean keywords.

Here are the challenges for the boolean keywords:

allOf

Since we are converting to an allOf already, and boolean AND is associative, this is straightforward. It's just more branches to combine but the mechanism is the same. So {"$combine": [A, {"allOf": [B, C]}]} is the same as {"$combine": [A, B, C]}.
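A possible Python sketch of that flattening follows. The function name flatten_all_of is hypothetical. The sketch also assumes that any keywords sitting next to a nested allOf can be kept together as one constituent of their own, which relies on the linearity property described earlier.

```python
def flatten_all_of(constituents):
    """Inline nested allOf branches into one flat list of constituents."""
    flat = []
    for schema in constituents:
        if isinstance(schema, dict) and "allOf" in schema:
            # Keywords beside the allOf stay together as a single constituent.
            rest = {k: v for k, v in schema.items() if k != "allOf"}
            if rest:
                flat.append(rest)
            flat.extend(flatten_all_of(schema["allOf"]))
        else:
            flat.append(schema)
    return flat

A = {"properties": {"a": {"type": "string"}}}
B = {"properties": {"b": {"type": "number"}}}
C = {"required": ["a"]}
D = {"minProperties": 1}

print(flatten_all_of([A, {"allOf": [B, {"allOf": [C, D]}]}]) == [A, B, C, D])  # True
```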

anyOf

An anyOf needs to be factored out, as the collection of properties and patternProperties should not draw from all branches, but only from one at a time. So {"$combine": [A, {"anyOf": [B, C]}]} becomes {"anyOf": [{"$combine": [A, B]}, {"$combine": [A, C]}]}, which reduces the problem to our existing rules.
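One way to sketch this factoring in Python is shown below. The function name distribute_any_of is hypothetical. When several constituents contain an anyOf, the sketch assumes the natural generalization of taking one branch from each, so the number of resulting branches is the product of the anyOf sizes.

```python
from itertools import product

def distribute_any_of(constituents):
    """Rewrite $combine constituents so each anyOf branch is combined separately."""
    choices = []
    for schema in constituents:
        if isinstance(schema, dict) and "anyOf" in schema:
            # Keywords beside the anyOf accompany whichever branch is chosen.
            rest = {k: v for k, v in schema.items() if k != "anyOf"}
            choices.append([[rest, branch] if rest else [branch]
                            for branch in schema["anyOf"]])
        else:
            choices.append([[schema]])
    branches = [
        {"$combine": [s for group in combo for s in group]}
        for combo in product(*choices)
    ]
    return branches[0] if len(branches) == 1 else {"anyOf": branches}

A = {"properties": {"a": {"type": "string"}}, "additionalProperties": False}
B = {"properties": {"b": {"type": "number"}}}
C = {"properties": {"c": {"type": "boolean"}}}

result = distribute_any_of([A, {"anyOf": [B, C]}])
print(result == {"anyOf": [{"$combine": [A, B]}, {"$combine": [A, C]}]})  # True
```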

not

This gets a little tricky. The fundamental principle with not is that:

{
    "not": {
        "properties": {"foo": {"type": "number"}},
        "patternProperties": {"^x-": {"type": "string"}},
        "additionalProperties": {"type": "boolean"}
    }
}

is equivalent to

{
    "not": {
        "allOf": [
            {"properties": {"foo": {"type": "number"}}},
            {"patternProperties": {"^x-": {"type": "string"}}},
            {
                "properties": {"foo": {}},
                "patternProperties": {"^x-": {}},
                "additionalProperties": {"type": "boolean"}
            }
        ]
    }
}

which, by DeMorgan's law, is equivalent to:

{
    "anyOf": [
        {"properties": {"foo": {"not": {"type": "number"}}}},
        {"patternProperties": {"^x-": {"not": {"type": "string"}}}},
        {
            "properties": {"foo": true},
            "patternProperties": {"^x-": true},
            "additionalProperties": {"not": {"type": "boolean"}}
        }
    ]
}

The not doesn't affect the validation of the presence or absence of the properties that are named or matched by a pattern. It only affects the validation of child values.

NOTE: Fully covering all of the things that can happen with not is complicated, and may require adding some new keywords like minAdditionalProperties to make this possible in all cases. I'll elaborate on these issues if there is ever momentum behind this proposal. They are all solvable.

oneOf

Note that {"oneOf": [A, B, C]} is equivalent to:

{
    "anyOf": [
        {"allOf": [A, {"not": B}, {"not": C}]},
        {"allOf": [{"not": A}, B, {"not": C}]},
        {"allOf": [{"not": A}, {"not": B}, C]}
    ]
}

It can be handled from this transformed point with the already established rules.
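The expansion generalizes to any number of branches. Here is a short Python sketch (the function name expand_one_of is hypothetical), followed by a truth-table check that "exactly one" and the expanded form agree for three branches:

```python
from itertools import product

def expand_one_of(branches):
    """Rewrite oneOf as an anyOf of allOf entries, negating all but one branch."""
    return {
        "anyOf": [
            {"allOf": [b if i == j else {"not": b} for j, b in enumerate(branches)]}
            for i in range(len(branches))
        ]
    }

A, B, C = {"type": "string"}, {"type": "number"}, {"type": "boolean"}
expanded = expand_one_of([A, B, C])
print(expanded["anyOf"][0])
# {'allOf': [{'type': 'string'}, {'not': {'type': 'number'}}, {'not': {'type': 'boolean'}}]}

# Check the boolean identity behind the rewrite for every combination of results.
for values in product([False, True], repeat=3):
    exactly_one = sum(values) == 1
    rewritten = any(
        all(v if i == j else not v for j, v in enumerate(values))
        for i in range(3)
    )
    assert exactly_one == rewritten
```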

dependencies

Name dependencies are unaffected, but when dealing with schema dependencies we have to handle the conditional nature of how the dependent schemas are applied. This involves observing that:

{
    "dependencies": {"foo": X},
    Y
}

Where Y is the rest of the schema containing dependencies, is equivalent to:

{
    "oneOf": [
        {
            "required": ["foo"],
            "allOf": [X, Y]
        },
        {
            "properties": {"foo": false},
            Y
        }
    ]
}

This reduces the problem to a combination of oneOf and allOf, which can therefore be reduced to a combination of allOf, anyOf, and not, which gets us back to our known rules.

Conclusion

This is, as far as I can tell, the only way to get this behavior while staying within the philosophical boundaries of JSON Schema.

I am not personally convinced that JSON Schema should directly incorporate any of the proposed "solutions" for the "additionalProperties": false "problem", but @epoberezkin requested in #15 that this be filed for the record.

I strongly prefer this (plus #98) over introducing arbitrary JSON transformations as is done by #15, but I would also be fine with rejecting both proposals (or even all three proposals including #98). All three proposals could be implemented separately and used either as a build step, or as an extended vocabulary indicated by another meta-schema.
