
Impl test report #540

Closed
wants to merge 12 commits into from

Conversation

gregsdennis (Member) commented Jan 8, 2022

Based on a conversation with @karenetheridge in Slack, this is the first step toward building an implementation comparison/support site.

The scripts in this PR will run a given implementation against the entire test suite and generate a report. This report can be used to generate site content.

Still have some work to do:

  • automatically run on commit to master (GitHub Actions, probably)
  • run optional tests

Julian (Member) commented Jan 8, 2022

Woohoo! This will be awesome. Will have a look tomorrow or so, but thanks, this is a big deal.


To add your implementation, you'll need to start by creating a CLI application that takes the following parameters:

- `--schema <PATH#POINTER>` - The relative URI to the schema, e.g. `../tests/draft2019-09/additionalItems.json#/0/schema`.

karenetheridge (Member) commented Jan 9, 2022

The implication here is that the instance and schema are expected to be files in JSON format. Are any other encoding formats allowed or expected to be supported? (e.g. YAML?) Must these be files on the local filesystem, or are network-reachable URIs possibly going to be used as well?

Member

e.g. is this legal? --schema https://json-schema.org/draft/2020-12/schema

gregsdennis (Member, Author)

These URIs point directly to the file and schema/instance inside of it. The docker-compose is copying the test suite as it exists in this repo directly into the container. Your CLI will need to find the file, open it, and navigate to the location indicated by the pointer fragment. The resulting JSON data will be the schema or instance as appropriate.
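
As a rough sketch (not part of this PR; the function name is made up), resolving such a `PATH#POINTER` argument on the implementation side could look like this in Python:

```python
# Illustrative sketch of resolving an argument such as
# `../tests/draft2019-09/additionalItems.json#/0/schema`.
import json

def resolve_argument(uri: str):
    """Split a PATH#POINTER argument, load the file, and walk the JSON Pointer."""
    path, _, pointer = uri.partition("#")
    with open(path) as f:
        node = json.load(f)
    for token in pointer.strip("/").split("/") if pointer.strip("/") else []:
        # Unescape JSON Pointer tokens (~1 -> /, ~0 -> ~), then index arrays or objects.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

schema = resolve_argument("../tests/draft2019-09/additionalItems.json#/0/schema")
```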

gregsdennis (Member, Author)

e.g. is this legal? --schema https://json-schema.org/draft/2020-12/schema

I expect that could be accepted by your CLI, but the test suite won't submit that. It'll only submit schemas that exist in the test suite.

Member

These URIs point directly to the file and schema/instance inside of it. The docker-compose is copying the test suite as it exists in this repo directly into the container. Your CLI will need to find the file, open it, and navigate to the location indicated by the pointer fragment.

Great, please add that to the README.

gregsdennis (Member, Author)

I had intended to add a "how it works" section that would include this.

karenetheridge (Member)

What do you think about my idea of passing the data and schemas to the implementation directly via STDIN? This may make it easier for you to generate randomized tests on the fly (and prevent the consuming implementation from potentially cheating, which it can do if it is allowed to see the entire file that the data comes from and therefore see what the correct valid/invalid response should be).

gregsdennis (Member, Author) commented Jan 9, 2022

What do you think about my idea of passing the data and schemas to the implementation directly via STDIN?

Your CLI can support whatever, but it MUST support what I've laid out. If you want to make it multipurpose so that you could use it outside of this test suite, that's fine.

The idea I'm presenting here is that we want to exercise the implementation against the tests in this repo. If a dev wants to make their CLI do more, then that's up to them. Such extra functionality won't be exercised here.

Regarding cheating, the CLI isn't given the metadata of the tests. It's only given the schema and the data, and from that it has to provide an outcome. The script puts the test metadata (including the expected result) and the actual result into the result file.

karenetheridge (Member)

If you want to make it multipurpose so that you could use it outside of this test suite, that's fine.

That's not what I'm saying. I'm suggesting that if you add to the requirements that the implementation accepts data on STDIN, it would make it easier to supply a wider variety of test data, and on an ad hoc basis.

It'll only submit schemas that exist in the test suite.

It shouldn't limit itself to that.

Regarding cheating, the CLI isn't given the metadata of the tests

It is, if you provide a filename and a reference point within it that obviously corresponds to the standard layout of the files in the test suite. If this is what you require, I can provide you with a tool that does nothing but look at a 'valid' property adjacent to 'data' and return that response.

That's why I think only providing data via files is a mistake -- it restricts too strongly what structure is used for testing, which limits the tool's use.

gregsdennis (Member, Author)

it would make it easier to supply a wider variety of test data, and on an ad hoc basis.

For the purposes of this runner, that's not necessary. This runner will only ever run the tests present in the suite.

If your intent is just to see how implementations might behave with a new scenario before you add it to the suite, that could be done locally easily enough.

It shouldn't limit itself to that.

Why not? We only have the tests that are present in these files.

If this is what you require, I can provide you with a tool that does nothing but look to a 'valid' property adjacent to 'data' and provide that response.

I think requiring this tells the implementors that we don't trust them. It's not really the signal we want to send. I see no real problem with pointing the CLI at the test case within the files.

Truthfully, an enterprising cheater could hard-code all the test cases into the tool with their expected result, which would still give a 100%.

I'm not really worried about cheating. If we're concerned, we can ask to see the source for the tool.

gregsdennis (Member, Author)

Additionally, this is a first draft at a runner. It doesn't need to be perfect.

Julian (Member) commented Jan 9, 2022

Personally I don't think we need to spend any effort on making sure an implementation isn't gaming the runner; such a thing is likely to be quickly discovered (by the implementation's users) and the consequences would be bad. It's just high risk, no reward, to me.

Julian (Member) commented Jan 9, 2022

Additionally, this is a first draft at a runner. It doesn't need to be perfect.

Of course I agree with this, but another thing that might be useful for a second pass, which I've already been thinking about, is a JSON-based file format that represents skips of tests in the suite. I.e. it should be possible to state in JSON which tests from the suite you want to run (across required and optional) and which you want to flag as skipped (or known failing). Today I do this in my implementation in Python code, but even on my own I will eventually move to JSON, because it's easier to work with -- standardizing such a thing may make things easier for this runner (even if we don't formally standardize it and just invent something for this repo).
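
Purely as an illustration of the idea (the file name, keys, and layout here are invented, not something this PR specifies), such a skip file and the runner-side check could look like:

```python
# Hypothetical skips.json consumed by the runner; keys and layout are invented.
#
#   {
#     "skip": ["draft2020-12/optional/"],
#     "known-failing": ["draft2019-09/recursiveRef.json#/5"]
#   }
import json

def load_filter(path="skips.json"):
    with open(path) as f:
        config = json.load(f)
    return config.get("skip", []), config.get("known-failing", [])

def classify(test_id, skipped, known_failing):
    """Decide how the runner should treat one test, identified by a path-like id."""
    if any(test_id.startswith(prefix) for prefix in skipped):
        return "skip"
    if test_id in known_failing:
        return "known-failing"
    return "run"

skipped, known_failing = load_filter()
print(classify("draft2020-12/optional/format/ipv4.json#/0/tests/0", skipped, known_failing))
```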

karenetheridge (Member)

Additionally, this is a first draft at a runner. It doesn't need to be perfect.

Yup, absolutely! I do think it would be a useful feature to have though, and it might be easier if we allow for that from the start rather than updating Docker containers multiple times.

If you're intent is to just see how implementations might behave with a new scenario before you add it to the suite, that could be done locally easily enough.

Yes, being able to generate tests programmatically would be very useful. The test data could still be written out to a file as an intermediary, but that seems like an unnecessary step when we could just stream them.

Even if the data comes from a file, we should not presume that the structure is identical to the existing test suite -- and providing examples that give JSON Pointers exactly matching the test suite structure does imply that. How about the example saying something like `--schema test.json#/0/schema --data test.json#/0/data` instead?

Truthfully, an enterprising cheater could hard-code all the test cases into the tool with their expected result, which would still give a 100%.

Yes, that's one reason why generated tests are useful.

I'm not really worried about cheating. If we're concerned, we can ask to see the source for the tool.

It's not about cheating specifically (although we have talked about this in the past, which is when we first started talking about randomly generating tests and feeding them to implementations blind, last year), but also about generating a wider variety of tests that we may not want to commit to the test suite directly. If you recall, I submitted a more comprehensive set of tests last year (PR #385) and they were rejected because of the size, with the suggestion that we could programmatically generate these tests and send them directly to implementations when such a capability was available. I'd still like to be able to do that.

gregsdennis (Member, Author)

it should be possible to state in JSON what tests from the suite you want to run (across required and optional) and what you want to flag as skipped (or known failing)

I would suggest we devise some kind of indexing system. It could help with the report as well. The index for a given test would need to remain consistent as the suite changes.
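
One way such an index could stay stable as the suite grows (a sketch only; nothing here is decided, and the fields used are illustrative) is to derive it from a test's identifying text rather than its array position:

```python
# Sketch: a content-derived test index that doesn't shift when new cases are
# inserted into a file.
import hashlib

def test_index(draft, filename, case_description, test_description):
    key = "\n".join([draft, filename, case_description, test_description])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]

print(test_index("draft2019-09", "additionalItems.json",
                 "<case description>", "<test description>"))
```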

providing examples that give json pointers that exactly match the test suite structure does imply that

The pointers submitted to the CLI don't need to match the test files. The runner is reading from the files, so in this case, they do. The pointer could specify anywhere.

How about the example saying something like --schema test.json#/0/schema --data test.json#/0/data instead?

There's nothing that's preventing this from working. Any file and any location within it is fine.

the suggestion that we could programmatically generate these tests and send them directly to implementations

We're not doing this right now. I want to focus on what we're doing. When we figure out how to generate tests, we can have people update their CLI tools. This is the more iterative approach and will get us running something faster.

Julian (Member) commented Jan 14, 2022

I'll take a shot at writing a hook for my own implementation this weekend, I think; I may have some feedback on the implementer-facing instructions after doing so (though they seem good already).

To add your implementation, you'll need to start by creating a CLI application that takes the following parameters:

- `--schema <PATH#POINTER>` - The relative URI to the schema, e.g. `../tests/draft2019-09/additionalItems.json#/0/schema`.
- `--instance <PATH#POINTER>` - The relative URI to the data instance, e.g. `../tests/draft2019-09/additionalItems.json#/0/tests/0/data`.

Julian (Member) commented Feb 3, 2022

I think it'd be easier (and more future compatible) if the interface was JSON in and out, rather than CLI arguments, no?

Specifically, implementers wouldn't need to do their own URI resolution if we instead had the runner harness pipe a JSON object with schema and instance in it to their stdin.

And on the result side, we may at some point need information back from the CLI, so rather than using the exit code, maybe we should instead immediately use `{"result": "valid"}` and `{"result": "invalid"}`; this way at some point we could have `{"result": "invalid", "known_bug": "myrepo/issues/123"}` or whatever.

Member

Doing it this way would also mean we could specify that CLIs should accept JSON terminated by line delimiters so we don't pay startup cost of each CLI repeatedly.
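
To illustrate how small that contract could be for an implementation hook (field names like "schema", "instance", and "result" follow the suggestion above but aren't final), a line-delimited loop might look like:

```python
# Sketch of a line-delimited JSON hook: one request per line on stdin, one
# result per line on stdout. Field names are suggestions, not a fixed contract.
import json
import sys

def validate(schema, instance):
    # An implementation would call its own validator here; this stub is a placeholder.
    raise NotImplementedError

for line in sys.stdin:
    if not line.strip():
        continue
    request = json.loads(line)  # e.g. {"schema": {...}, "instance": ...}
    outcome = "valid" if validate(request["schema"], request["instance"]) else "invalid"
    print(json.dumps({"result": outcome}), flush=True)
```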

gregsdennis (Member, Author)

Doing it this way would also mean we could specify that CLIs should accept JSON terminated by line delimiters so we don't pay startup cost of each CLI repeatedly.

We'd have to minimize all of the test files to remove line delimiters. I'm not sure I can do that in a bash script.

Given that I'm learning as I go, all of this is a tall order.

Member

Do you mean newlines? Two NLs in a row could be used to signal the end of a JSON blob, rather than removing all newlines and passing the data as a single line.

- `--schema <PATH#POINTER>` - The relative URI to the schema, e.g. `../tests/draft2019-09/additionalItems.json#/0/schema`.
- `--instance <PATH#POINTER>` - The relative URI to the data instance, e.g. `../tests/draft2019-09/additionalItems.json#/0/tests/0/data`.
- `--spec-version <VERSION>` - The draft/version for the schema, e.g. `draft2020-12`. The value for this parameter will match a folder name under `/tests/` in this repo.
<!-- Do we need this parameter? -->

Member

I can't see why we'd need it, so maybe kill it until someone asks for it, though adding it would be easier with the above.

gregsdennis (Member, Author)

We may need it to configure the optional format tests. Some validators might not have format validation enabled by default, and we could use this as a signal to the implementation to enable it.

For draft 2020-12 we could use a custom format-required meta-schema, but that wouldn't work for the other drafts.
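
For reference, a 2020-12 meta-schema that makes `format` an assertion would roughly take the following shape (shown here as a Python literal; the `$id` is made up), declaring the format-assertion vocabulary in place of format-annotation:

```python
# Rough sketch of a "format-required" 2020-12 meta-schema; the $id is invented.
format_required_metaschema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/format-required-metaschema",
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": True,
        "https://json-schema.org/draft/2020-12/vocab/applicator": True,
        "https://json-schema.org/draft/2020-12/vocab/unevaluated": True,
        "https://json-schema.org/draft/2020-12/vocab/validation": True,
        "https://json-schema.org/draft/2020-12/vocab/meta-data": True,
        "https://json-schema.org/draft/2020-12/vocab/format-assertion": True,
        "https://json-schema.org/draft/2020-12/vocab/content": True,
    },
    # Reuse the standard meta-schema for the syntactic constraints.
    "$ref": "https://json-schema.org/draft/2020-12/schema",
}
```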

Member

For an initial proof-of-concept I'd leave out all optional tests and stick to just what the specification requires.

Later on, we'll have to think about what config options would be required (and how to signal them to the application) for the various optional tests, as there's a mixed bag of them in there. (Or we could take the opportunity to rethink our organization of optional tests, since this keeps coming up as a point of confusion and we're being rather informal about the whole thing right now.) E.g. some tests require format validation to be supported and turned on; some tests require bignum support; some tests require non-standard regex syntax; etc. We may also want to add more optional tests later on for testing annotation collection, the operation of the content* keywords, etc.


- `docker-image` - The name of your docker image. The `latest` tag will be used.
- `command` - The command to run your CLI.
- `versions` - An array of versions/drafts supported by your implementations. (To save time, unsupported versions will be skipped.)

Member

It seems like this may cause a rush of PRs when new drafts are released as each person adds the new draft to the file.

Maybe instead we indeed should send the draft version, and specify that the implementer's CLI needs to return an exit status that indicates to skip the draft if it doesn't recognize it.
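
On the harness side that could be as simple as checking for an agreed exit code; in this sketch the code 2 and the 0/non-zero convention for valid/invalid are assumptions, not anything this PR specifies:

```python
# Sketch: treat a hypothetical agreed-upon exit code as "draft not supported, skip".
import subprocess

UNSUPPORTED_DRAFT = 2  # invented convention for illustration

proc = subprocess.run(
    ["dotnet", "json-everything.dll",
     "--schema", "../tests/draft2019-09/additionalItems.json#/0/schema",
     "--instance", "../tests/draft2019-09/additionalItems.json#/0/tests/0/data",
     "--spec-version", "draft2019-09"],
)
if proc.returncode == UNSUPPORTED_DRAFT:
    result = "skipped"
else:
    result = "valid" if proc.returncode == 0 else "invalid"  # assumed 0 = valid
print(result)
```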

Member

That sounds good to me. It would be nice for things to adapt automatically as we add a new spec version or an implementation extends support to a new version in their 'latest' image.

[
  {
    "docker-image": "gregsdennis/json-everything",
    "command": "dotnet json-everything.dll",

Member

This should probably be an array to avoid the off-chance of needing shell quoting.

gregsdennis (Member, Author)

I don't understand.

Member

(I deleted my previous "I agree that this doesn't make sense" comment, after thinking about it harder) :)

If a command is provided as a single string, then the implication is it's passed to the shell, which may parse some characters unless they're escaped; if the command is passed as an array of words, then we can bypass the shell entirely and just hand it to exec (or equivalent).

e.g. perl's exec and system functions pass the command through the shell if one argument is provided, but skips that step if multiple arguments are provided, with the presumption that tokenization has already been done (and without having to fuss with escaping characters that the shell might do something with). https://perldoc.perl.org/functions/exec
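
The same distinction in Python terms (the command is just the example from the registry above, used only for illustration):

```python
# A single string goes through the shell, so quoting and special characters matter;
# a list of words is passed straight to exec with no shell parsing at all.
import subprocess

# Shell form: the shell splits the string and interprets quotes, $, spaces, etc.
subprocess.run("dotnet json-everything.dll --schema 'file with spaces.json#/0/schema'",
               shell=True)

# Exec form: each list element is exactly one argument; no escaping needed.
subprocess.run(["dotnet", "json-everything.dll",
                "--schema", "file with spaces.json#/0/schema"])
```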

gregsdennis (Member, Author)

Okay. Again, I'm learning as I go here, so feel free to show an example or add a commit. I don't know how to translate a string array into a bash command.

Member

That's what I was referring to yeah -- and no worries at all! I can leave some commits showing what I mean -- it's true, shell scripting indeed doesn't make some of this trivial.

Julian (Member) commented Nov 8, 2022

@gregsdennis are you OK with me closing this as superseded by Bowtie (which now does this at least as much as the PR did)? (Thanks, you definitely get a ton of the credit for finally pushing for this to happen...)

gregsdennis (Member, Author)

Yep. Bowtie looks great.

gregsdennis closed this Nov 8, 2022