
Updated tests #37


Closed

Conversation

epoberezkin
Collaborator

No description provided.

@epoberezkin
Collaborator Author

@ebdrup added new tests and updated the meta-schema to match the changes in the JSON-Schema org (https://github.com/json-schema-org/json-schema-spec/blob/dba92b702c94858162f653590230e7573c8b7dd0/schema.json)

Quite a few validators failed additional tests...

@epoberezkin epoberezkin requested a review from ebdrup April 6, 2019 20:11
@ebdrup
Owner

ebdrup commented Apr 7, 2019

You should make the new and changed tests optional in the benchmark, so that the list of tested schema validators remains the same.

What has actually changed in the spec? To me it seems weird that they would change an old spec like that.

@ebdrup
Owner

ebdrup commented Apr 7, 2019

Oh @epoberezkin BTW, thanks a lot for the contribution. Could you make the PR in a way where the changed reports are not part of it? Then I can run the benchmarks myself. That would also make it a lot easier to review the change.

@@ -28,12 +28,10 @@
     "type": "object",
     "properties": {
         "id": {
-            "type": "string",
-            "format": "uri"
+            "type": "string"

@korzio korzio Apr 7, 2019


@epoberezkin
Can you please elaborate what this change is about? I see it breaks some tests. Thanks!

@epoberezkin
Collaborator Author

epoberezkin commented Apr 8, 2019

What has actually changed in the spec? To me it seems weird that they would change an old spec like that.

@ebdrup The spec didn't really change - only tests were added, to provide better coverage of the spec.

Can you please elaborate what this change is about? I see it breaks some tests. Thanks!

@korzio This is the draft-04 meta-schema that was updated in the json-schema org, as it wasn't correct previously. A schema id should be a uri-reference, but that format was only added in later versions of the spec, so the "format" keyword was simply removed. The "uri" format is much stricter than what "id" requires. I'm not sure how any tests can fail because of removing it - could you clarify which tests failed?
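
To illustrate (a hypothetical example, not from the PR or the test suite): all of the values below are plausible draft-04 "id" values, i.e. URI references, but only the first is an absolute URI, so a strict "uri" format check would reject the rest.

// Hypothetical "id" values, assuming a strict RFC 3986 "uri" format check.
const schemaIds = [
  "http://example.com/schemas/person.json", // absolute URI: passes "format": "uri"
  "person.json",                            // relative reference: a legal id, but fails a strict "uri" check
  "definitions/address.json#",              // relative reference with fragment: fails
  "#item",                                  // fragment-only reference: fails
];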

@epoberezkin
Collaborator Author

Could you make the PR in a way where the changed reports are not part of it? Then I can run the benchmarks myself. That would also make it a lot easier to review the change.

@ebdrup There were two commits - the first with the change, the second with the updated reports. I removed the second.

@korzio

korzio commented Apr 9, 2019

@epoberezkin Thanks for the comment, I will have a look at my package.

@korzio

korzio commented Apr 10, 2019

@epoberezkin Do you know whether there are plans to release a new npm package version of JSON-Schema-Test-Suite (https://github.com/json-schema-org/JSON-Schema-Test-Suite#nodejs)?
My validator tests depend on this package right now, and there are a bunch of changes in the release branch:
json-schema-org/JSON-Schema-Test-Suite@86f965e...15ba997

@epoberezkin
Collaborator Author

I use the submodule - probably worth raising an issue there to auto-release all merged changes

@korzio

korzio commented May 7, 2019

korzio/djv#78

@ebdrup
Owner

ebdrup commented Sep 18, 2019

I can't merge this until this is addressed:
You should make the new and changed tests optional in the benchmark, so that the list of tested schema validators remains the same.

@epoberezkin
Collaborator Author

You should make the new and changed tests optional in the benchmark, so that the list of tested schema validators remains the same

@ebdrup I'm not quite sure why that should be the requirement, rather than the validators fixing the failing tests?

Maybe a better approach would be to change the benchmark completely - simply show the number of failed tests, but run the benchmark on some more realistic data examples:

  • large objects
  • small objects
  • long arrays
  • short arrays
  • large maps
  • deep objects
  • recursive data structures (trees) - some validators would not support them, but this case can be made optional

This is because the performance of a validator on a test suite where 50% of the tests are supposed to fail is not really indicative of real-world performance - error generation is more expensive in terms of performance.

Performance mostly matters for server-side validation, not so much for client-side. Server-side, validation passes most of the time (unless there is an application mistake), because when it validates user input it is in most cases complemented by client-side validation based on the same schema. So the cost of generating errors should not affect the benchmark, I think.

What do you think?

@ebdrup
Owner

ebdrup commented Jan 18, 2020

I think it's a good idea. The reason the test suite was chosen as data for the benchmark was that if some feature was implemented very poorly (performance-wise) it would show. It was also simply less work. That would still be semi-true if more tests were ignored in the benchmark: at least all correctly implemented features across validators would be included. It's a bit of a pickle - if validators are not in the benchmark, I think the authors lose interest in fixing and improving.
Of course the list does not have to be 100% the same, but most of the chart leaders should still be there, in case someone using one of them is considering a switch.

@epoberezkin
Collaborator Author

I am not suggesting to stop running the tests - they can be run to establish how many of them fail.

What I meant is that the test suite is absolutely not indicative of real-world server-side performance: in the test suite 50% of the tests are failing (and they are more expensive to run than passing tests), while in the real world the absolute majority of validated data passes validation.

So, for the test suite compliance benchmark running the tests makes sense.

But for the performance benchmark it does not make much sense, for the reasons above.

What I suggested is that for the performance benchmark only passing data should be used (or maybe at least 90% of the data should be passing, although in real-world server-side scenarios it's over 99%, so using no failing data makes more sense for the performance benchmark).
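
As a rough sketch of what that could look like against the current test-suite layout (the directory argument and the compile callback are assumptions for illustration, not code from this PR): keep only the cases whose expected outcome is valid: true and time the validator on that passing data.

// Sketch only: filter the draft-04 test suite down to the cases that expect
// the data to be VALID, then time a validator on that passing data.
import * as fs from "fs";
import * as path from "path";

interface TestCase { description: string; data: unknown; valid: boolean; }
interface TestGroup { description: string; schema: unknown; tests: TestCase[]; }

// `compile` stands in for whichever validator is being benchmarked.
export function timePassingDataMs(
  suiteDir: string, // e.g. the tests/draft4 directory of the test-suite submodule
  compile: (schema: unknown) => (data: unknown) => boolean
): number {
  const groups: TestGroup[] = fs
    .readdirSync(suiteDir)
    .filter((file) => file.endsWith(".json"))
    .flatMap((file) => JSON.parse(fs.readFileSync(path.join(suiteDir, file), "utf8")));

  // Keep only the tests where the data is expected to pass validation.
  const passingOnly = groups
    .map((group) => ({ ...group, tests: group.tests.filter((t) => t.valid) }))
    .filter((group) => group.tests.length > 0);

  const start = process.hrtime.bigint();
  for (const group of passingOnly) {
    const validate = compile(group.schema);
    for (const test of group.tests) validate(test.data);
  }
  return Number(process.hrtime.bigint() - start) / 1e6;
}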

@epoberezkin
Collaborator Author

So specifically we could have:

  • one large passing data object
  • 5-10 small passing objects (they can be the same object, run 5-10 times to balance against the large-object case) and one failing object
  • one long passing array with mid-size objects
  • 5-10 short passing arrays and one failing small array
  • one large passing map
  • 5-10 small passing maps and one failing small map
  • one deep large passing object
  • one large passing tree object (defined by a self-recursive schema) - it can be optional; see the sketch below
  • one large passing tree object (defined by two mutually recursive schemas) - it should be optional

A bit of work to do, but it would make the benchmark reflect real-world performance and not performance on the test suite.
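
For the self-recursive tree case, a minimal sketch of what the schema and a generator for passing data might look like (schema and names invented for illustration, not taken from the repository):

// Illustration only: a tree described by a self-recursive draft-04 schema,
// plus a generator that produces passing data of a chosen size.
const treeSchema = {
  id: "http://example.com/tree.json",
  type: "object",
  required: ["value"],
  properties: {
    value: { type: "integer" },
    children: {
      type: "array",
      items: { $ref: "#" }, // self-recursive: every child must itself be a valid tree
    },
  },
  additionalProperties: false,
};

// Builds a passing tree with the given depth and branching factor.
function makeTree(depth: number, width: number): object {
  if (depth === 0) return { value: 0 };
  return {
    value: depth,
    children: Array.from({ length: width }, () => makeTree(depth - 1, width)),
  };
}

const largePassingTree = makeTree(6, 4); // roughly 5,500 nodes of passing data

The mutually recursive variant would split this into two schemas that $ref each other, which is why that case should stay optional for validators that do not support it.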

@ebdrup
Owner

ebdrup commented Jan 18, 2020 via email

@epoberezkin
Collaborator Author

epoberezkin commented Jan 19, 2020

By “failing” I meant the tests that require that the data fails validation - from the test-suite point of view those tests of course pass. In real-world validation scenarios on the server (where performance matters), data passes validation more than 99% of the time; in the test suite, the majority of the tests require, in order to pass, that the data fails validation.

@ebdrup
Owner

ebdrup commented Jan 20, 2020 via email

@ebdrup
Owner

ebdrup commented May 7, 2020

I'm currently working on an update here instead: #44
I'll close this PR.

I'm giving djv a chance to fix all the failing tests before I merge the PR.

@ebdrup ebdrup closed this May 7, 2020