Impl test report #540
Conversation
Woohoo! This will be awesome. Will have a look tomorrow or so, but thanks, this is a big deal.
To add your implementation, you'll need to start by creating a CLI application that takes the following parameters:
- `--schema <PATH#POINTER>` - The relative URI to the schema, e.g. `../tests/draft2019-09/additionalItems.json#/0/schema`.
The implication here is that the instance and schema are expected to be files in JSON format. Are any other encoding formats allowed or expected to be supported? (e.g. YAML?) Must these be files on the local filesystem, or are network-reachable URIs possibly going to be used as well?
e.g. is this legal? `--schema https://json-schema.org/draft/2020-12/schema`
These URIs point directly to the file and schema/instance inside of it. The docker-compose is copying the test suite as it exists in this repo directly into the container. Your CLI will need to find the file, open it, and navigate to the location indicated by the pointer fragment. The resulting JSON data will be the schema or instance as appropriate.
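To make that resolution step concrete, here's a minimal sketch in Python (purely illustrative; an implementation's CLI can of course be written in any language). It assumes the `PATH#POINTER` shape shown in the examples above, with the path relative to the CLI's working directory:

```python
import json
from urllib.parse import unquote


def resolve(uri: str):
    """Split a PATH#POINTER argument, load the file, and walk the JSON Pointer."""
    path, _, pointer = uri.partition("#")
    with open(path) as f:
        node = json.load(f)
    # An empty pointer refers to the whole document; otherwise walk each segment.
    for segment in unquote(pointer).split("/")[1:]:
        segment = segment.replace("~1", "/").replace("~0", "~")
        node = node[int(segment)] if isinstance(node, list) else node[segment]
    return node


schema = resolve("../tests/draft2019-09/additionalItems.json#/0/schema")
instance = resolve("../tests/draft2019-09/additionalItems.json#/0/tests/0/data")
```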
> e.g. is this legal? `--schema https://json-schema.org/draft/2020-12/schema`
I expect that could be accepted by your CLI, but the test suite won't submit that. It'll only submit schemas that exist in the test suite.
> These URIs point directly to the file and schema/instance inside of it. The docker-compose is copying the test suite as it exists in this repo directly into the container. Your CLI will need to find the file, open it, and navigate to the location indicated by the pointer fragment.
Great, please add that to the README.
I had intended to add a "how it works" section that would include this.
What do you think about my idea of passing the data and schemas to the implementation directly via STDIN? This may make it easier for you to generate randomized tests on the fly (and prevent the consuming implementation from potentially cheating, which it can do if it is allowed to see the entire file that the data comes from and therefore see what the correct valid/invalid response should be).
Your CLI can support whatever, but it MUST support what I've laid out. If you want to make it multipurpose so that you could use it outside of this test suite, that's fine. The idea I'm presenting here is that we want to exercise the implementation against the tests in this repo. If a dev wants to make their CLI do more, then that's up to them. Such extra functionality won't be exercised here.

Regarding cheating, the CLI isn't given the metadata of the tests. It's only given the schema and the data, and from that it has to provide an outcome. The script puts the test metadata (including the expected result) and the actual result into the result file.
That's not what I'm saying. I'm suggesting that if you add to the requirements that the implementation accepts data on STDIN, it would make it easier to supply a wider variety of test data, and on an ad hoc basis.
It shouldn't limit itself to that.
It is, if you provide a filename and a reference point within it that obviously corresponds to the standard layout of the files in the test suite. If this is what you require, I can provide you with a tool that does nothing but look for a 'valid' property adjacent to 'data' and provide that response. That's why I think only providing data via files is a mistake -- it restricts too strongly what structure is used for testing, which means the tool's use is limited.
For the purposes of this runner, that's not necessary. This runner will only ever run the tests present in the suite. If your intent is just to see how implementations might behave with a new scenario before you add it to the suite, that could be done locally easily enough.
Why not? We only have the tests that are present in these files.
I think requiring this tells the implementors that we don't trust them. It's not really the signal we want to send. I see no real problem with pointing the CLI at the test case within the files. Truthfully, an enterprising cheater could hard-code all the test cases into the tool with their expected result, which would still give a 100%. I'm not really worried about cheating. If we're concerned, we can ask to see the source for the tool.
Additionally, this is a first draft of a runner. It doesn't need to be perfect.
Personally I don't think we need to spend any effort on making sure an implementation isn't gaming the runner; such a thing is likely to be quickly discovered (by the implementation's users) and the consequences will be bad. It's just high risk, no reward to me.
Of course I agree with this, but another thing that might be useful for a second pass, which I've already been thinking about, is that I'd like to have a JSON-based file format which represents skips of tests in the suite. I.e. it should be possible to state in JSON what tests from the suite you want to run (across required and optional) and what you want to flag as skipped (or known failing). Today I do this in my implementation in Python code, but even on my own I will eventually move to JSON in my implementation, because it's easier to work with -- standardizing such a thing, though, may make things easier for this runner (even if we don't formally standardize it, and just invent something for this repo).
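Purely as a hypothetical illustration of the idea (nothing below is an agreed format; the key names and file paths are invented for the sketch), such a file could be consumed like this:

```python
import json

# Invented shape for this sketch only, e.g.:
# {
#   "skip": [
#     {"file": "optional/bignum.json", "reason": "arbitrary precision not supported"}
#   ],
#   "known_failing": [
#     {"file": "optional/format/email.json", "reason": "tracked in issue tracker"}
#   ]
# }


def load_skips(path: str) -> dict:
    with open(path) as f:
        return json.load(f)


def should_run(skips: dict, test_file: str) -> bool:
    """A test file runs unless it appears in the 'skip' list."""
    return all(entry["file"] != test_file for entry in skips.get("skip", []))
```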
Yup, absolutely! I do think it would be a useful feature to have though, and it might be easier if we allow for that from the start rather than updating Docker containers multiple times.
Yes, being able to generate tests programmatically would be very useful. The test data could still be written out to a file as an intermediary, but that seems like an unnecessary step when we could just stream them. Even if the data comes from a file, we should not presume that the structure is identical to the existing test suite -- and providing examples that give json pointers that exactly match the test suite structure does imply that. How about the example saying something like
Yes, that's one reason why generated tests are useful.
It's not about cheating specifically (although we have talked about this in the past, which is when we first started talking about randomly generating tests and feeding them to implementations blind, last year), but also about generating a wider variety of tests that we may not want to commit to the test suite directly. If you recall, I submitted a more comprehensive set of tests last year (PR #385) and they were rejected because of the size, with the suggestion that we could programmatically generate these tests and send them directly to implementations when such a capability was available. I'd still like to be able to do that.
I would suggest we devise some kind of indexing system. It could help with the report as well. The index for a given test would need to remain consistent as the suite changes.
The pointers submitted to the CLI don't need to match the test files. The runner is reading from the files, so in this case they do. The pointer could point anywhere.
There's nothing that's preventing this from working. Any file and any location within it is fine.
We're not doing this right now. I want to focus on what we're doing. When we figure out how to generate tests, we can have people update their CLI tools. This is the more iterative approach and will get us running something faster.
I'll take a shot at writing a hook for my own implementation this weekend, I think; I may have some feedback on the implementer-facing instructions after doing so (though they seem good already).
- `--schema <PATH#POINTER>` - The relative URI to the schema, e.g. `../tests/draft2019-09/additionalItems.json#/0/schema`.
- `--instance <PATH#POINTER>` - The relative URI to the data instance, e.g. `../tests/draft2019-09/additionalItems.json#/0/tests/0/data`.
I think it'd be easier (and more future-compatible) if the interface was JSON in and out, rather than CLI arguments, no? Specifically, implementers wouldn't need to do their own URI resolution if we instead had the runner harness pipe a JSON object with `schema` and `instance` in it to their stdin.
And on the result side, we may at some point need information back from the CLI, so rather than using the exit code, maybe we should instead immediately use `{"result": "valid"}` and `{"result": "invalid"}`; this way, at some point we could have `{"result": "invalid", "known_bug": "myrepo/issues/123"}` or whatever.
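As a rough sketch of what that could look like on the implementation side (illustrative only; `validate` is a stand-in for whatever validator library an implementer actually uses):

```python
import json
import sys


def validate(schema, instance) -> bool:
    # Stand-in for the implementation's real validation call.
    raise NotImplementedError


def main() -> None:
    request = json.load(sys.stdin)  # e.g. {"schema": {...}, "instance": {...}}
    ok = validate(request["schema"], request["instance"])
    # Emit a JSON result rather than relying on the exit code, so the shape
    # can grow extra fields (like "known_bug") later without breaking anything.
    print(json.dumps({"result": "valid" if ok else "invalid"}))


if __name__ == "__main__":
    main()
```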
Doing it this way would also mean we could specify that CLIs should accept JSON terminated by line delimiters so we don't pay the startup cost of each CLI repeatedly.
> Doing it this way would also mean we could specify that CLIs should accept JSON terminated by line delimiters so we don't pay the startup cost of each CLI repeatedly.
We'd have to minify all of the test files to remove line delimiters. I'm not sure I can do that in a bash script.
Given that I'm learning as I go, all of this is a tall order.
Do you mean newlines?
Two NLs in a row could be used to signal the end of a JSON blob, rather than removing all newlines and passing the data as a single line.
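For what it's worth, the suite files themselves wouldn't need to be minified for that: whatever drives the CLI can re-serialize each schema/instance pair compactly so that each request occupies exactly one line. A small illustrative sketch (Python here just for brevity; the test cases and command name are placeholders, not anything this PR defines):

```python
import json
import subprocess

# Placeholder test cases; the runner would extract these from the suite files.
cases = [
    {"schema": {"type": "integer"}, "instance": 12},
    {"schema": {"type": "integer"}, "instance": "12"},
]

# json.dumps emits no newlines by default, so each request is a single line
# regardless of how the source files are formatted.
payload = "".join(json.dumps(case) + "\n" for case in cases)

# "some-validator-cli" is a made-up command for the sketch.
completed = subprocess.run(
    ["some-validator-cli"], input=payload, capture_output=True, text=True
)
results = [json.loads(line) for line in completed.stdout.splitlines() if line]
```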
- `--schema <PATH#POINTER>` - The relative URI to the schema, e.g. `../tests/draft2019-09/additionalItems.json#/0/schema`.
- `--instance <PATH#POINTER>` - The relative URI to the data instance, e.g. `../tests/draft2019-09/additionalItems.json#/0/tests/0/data`.
- `--spec-version <VERSION>` - The draft/version for the schema, e.g. `draft2020-12`. The value for this parameter will match a folder name under `/tests/` in this repo.
<!-- Do we need this parameter? -->
I can't see why we'd need it, so maybe kill it until someone asks for it, though adding it would be easier with the above.
We may need it to configure the optional `format` tests. Some validators might not have that enabled, and we could use this as a signal to the implementation to enable it.
For draft 2020-12 we could use a custom format-required meta-schema, but that wouldn't work for the other drafts.
For an initial proof-of-concept I'd leave out all optional tests and stick to just what the specification requires.
Later on, we'll have to think about what config options would be required (and how to signal them to the application) for the various optional tests, as there's a mixed bag of a bunch of them in there (or, we could take the opportunity to rethink our organization of optional tests, since this keeps coming up as a point of confusion and we're being rather informal about the whole thing right now). E.g. some tests require format validation to be supported and turned on; some tests require bignum support; some tests require non-standard regex syntax, etc. We may also want to add more optional tests later on for testing annotation collection, or the operation of the `content*` keywords, etc.
- `docker-image` - The name of your docker image. The `latest` tag will be used.
- `command` - The command to run your CLI.
- `versions` - An array of versions/drafts supported by your implementation. (To save time, unsupported versions will be skipped.)
It seems like this may cause a rush of PRs when new drafts are released as each person adds the new draft to the file.
Maybe instead we should indeed send the draft version, and specify that the implementer's CLI needs to return an exit status indicating that the draft should be skipped if the CLI doesn't recognize it.
That sounds good to me. It would be nice for things to adapt automatically as we add a new spec version or an implementation extends support to a new version in their 'latest' image.
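A tiny sketch of that idea from the CLI's side (the supported set and the exit-code value below are made up for illustration; nothing here has been agreed):

```python
import sys

SUPPORTED_VERSIONS = {"draft7", "draft2019-09", "draft2020-12"}  # example set only
EXIT_UNSUPPORTED_VERSION = 2  # hypothetical code meaning "skip this draft"


def requested_version(argv: list) -> str:
    """Return the value following --spec-version, or an empty string if absent."""
    if "--spec-version" in argv:
        i = argv.index("--spec-version")
        if i + 1 < len(argv):
            return argv[i + 1]
    return ""


version = requested_version(sys.argv)
if version and version not in SUPPORTED_VERSIONS:
    sys.exit(EXIT_UNSUPPORTED_VERSION)  # the runner would record this draft as skipped
```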
[
  {
    "docker-image": "gregsdennis/json-everything",
    "command": "dotnet json-everything.dll",
This should probably be an array to avoid the off-chance of needing shell quoting.
I don't understand.
(I deleted my previous "I agree that this doesn't make sense" comment, after thinking about it harder) :)
If a command is provided as a single string, then the implication is that it's passed to the shell, which may parse some characters unless they're escaped; if the command is passed as an array of words, then we can bypass the shell entirely and just hand it to `exec` (or equivalent).
E.g. perl's `exec` and `system` functions pass the command through the shell if one argument is provided, but skip that step if multiple arguments are provided, with the presumption that tokenization has already been done (and without having to fuss with escaping characters that the shell might do something with). https://perldoc.perl.org/functions/exec
Okay. Again, I'm learning as I go here, so feel free to show an example or add a commit. I don't know how to translate a string array into a bash command.
That's what I was referring to yeah -- and no worries at all! I can leave some commits showing what I mean -- it's true, shell scripting indeed doesn't make some of this trivial.
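For what it's worth, the same distinction shows up in most languages. A hedged sketch in Python (rather than the bash the runner actually uses) of how a pre-tokenized command array avoids the shell entirely:

```python
import subprocess

# The "command" entry from the implementations file, already split into words.
command = ["dotnet", "json-everything.dll"]

# Arguments following the interface described in this PR.
args = [
    "--schema", "../tests/draft2019-09/additionalItems.json#/0/schema",
    "--instance", "../tests/draft2019-09/additionalItems.json#/0/tests/0/data",
]

# Passing a list means no shell is involved, so nothing needs quoting or escaping.
subprocess.run(command + args)

# A single string, by contrast, only works via the shell, where special
# characters in any argument would have to be escaped:
# subprocess.run("dotnet json-everything.dll --schema ...", shell=True)
```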
@gregsdennis are you OK with me closing this as superseded by Bowtie (which does this now at least as much as the PR did)? (Thanks, you def get a ton of the credit for finally pushing for this to happen...)
Yep. Bowtie looks great.
Based on a conversation with @karenetheridge in Slack, this is the first step toward building an implementation comparison/support site.
The scripts in this PR will run a given implementation against the entire test suite and generate a report. This report can be used to generate site content.
Still have some work to do: