Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: alternative shape for lists #158

Merged
merged 7 commits into from
Apr 12, 2024
Merged

perf: alternative shape for lists #158

merged 7 commits into from
Apr 12, 2024

Conversation

giacomociti
Copy link
Contributor

for some reason, the current shapes for lists (and also those in shacl-shacl) cause perf issues to the shacl library, which fails with big lists (in the order of the thousands elements).

For example, using this constraint:

$ barnard59 cube fetch-constraint --endpoint https:/int.lindas.admin.ch/query --cube https://energy.ld.admin.ch/elcom/electricityprice > constraint.ttl

and trying to validate it:

barnard59 shacl validate --shapes validation/standalone-constraint-constraint.ttl < constraint.ttl

we get the error: Maximum call stack size exceeded.

The error disappears with the shapes proposed in this PR.

I wanted to add an approval test but I gave up because, besides the main meaningful validation messages, the report includes many confusing result details (the same happens also with the existing list shapes).

This is a simplified repro with an unexpected focusNode in the result, maybe related to a known issue

Copy link

changeset-bot bot commented Mar 25, 2024

🦋 Changeset detected

Latest commit: e3ce57c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
cube-link Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@giacomociti giacomociti requested a review from tpluscode March 25, 2024 13:38
@tpluscode
Copy link
Contributor

cc @Rdataflow

Copy link
Contributor

@tpluscode tpluscode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test case with some broken lists

@giacomociti giacomociti requested a review from tpluscode March 27, 2024 09:30
@Rdataflow
Copy link
Contributor

@giacomociti this allows to check very long lists 🎉 🎉 🎉

@tpluscode IIUC with this change we can bump the list length to ~ 5000 in cube-creator, correct? (... of course you'd add some CI tests to confirm it actually works as expected 👍)

@Rdataflow
Copy link
Contributor

Rdataflow commented Mar 27, 2024

@giacomociti I observe some odd behaviour and results using

barnard59 cube fetch-constraint --endpoint https:/test.lindas.admin.ch/query --cube https://energy.ld.admin.ch/elcom/electricityprice > constraint_test.ttl
barnard59 shacl validate --shapes validation/standalone-constraint-constraint.ttl < constraint_test.ttl | barnard59 shacl report-summary

resulting in

warn: 13 results with severity http://www.w3.org/ns/shacl#Warning {"timestamp":"2024-03-27T10:05:49.952Z"}
error: 29 violations found {"stack":"Error: 29 violations found\n    at Object.<anonymous> (file:///C:/Users/Z70ATBE/node_modules/barnard59-shacl/lib/report.js:35:18)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async nextAsync (node:internal/streams/from:182:33)","timestamp":"2024-03-27T10:05:50.062Z"}
Warning of NodeConstraintComponent: "The use of qudt:unit is deprecated, please use qudt:hasUnit instead" with path <http://www.w3.org/ns/shacl#property> at focus node <https://energy.ld.admin.ch/elcom/electricityprice/shape> (source: _:b88_b630)
Violation of NodeConstraintComponent: "Value does not have shape <https://cube.link/shape/standalone-constraint-constraint#listnode>" with path null at focus node <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> (source: _:b88_b671)
Violation of MinCountConstraintComponent: "list node needs exactly one rdf:first" with path <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> at focus node <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> (source: _:b88_b674)
Violation of MinCountConstraintComponent: "list node needs exactly one rdf:rest" with path <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> at focus node <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> (source: _:b88_b675)
Violation of EqualsConstraintComponent: "Must have same values as <http://qudt.org/schema/qudt/hasUnit>" with path <http://qudt.org/schema/qudt/unit> at focus node _:b88_b88_b88_genid2d5510cd0133f64e44a42fe88022dd4e2a2d0b4361 (source: _:b88_b631)
Warning of NodeConstraintComponent: "The use of qudt:unit is deprecated, please use qudt:hasUnit instead" with path <http://www.w3.org/ns/shacl#property> at focus node <https://energy.ld.admin.ch/elcom/electricityprice/shape> (source: _:b88_b630)
Violation of EqualsConstraintComponent: "Must have same values as <http://qudt.org/schema/qudt/hasUnit>" with path <http://qudt.org/schema/qudt/unit> at focus node _:b88_b88_b88_genid2d5510cd0133f64e44a42fe88022dd4e2a2d0b4362 (source: _:b88_b631)
Warning of NodeConstraintComponent: "The use of qudt:unit is deprecated, please use qudt:hasUnit instead" with path <http://www.w3.org/ns/shacl#property> at focus node <https://energy.ld.admin.ch/elcom/electricityprice/shape> (source: _:b88_b630)
Violation of EqualsConstraintComponent: "Must have same values as <http://qudt.org/schema/qudt/hasUnit>" with path <http://qudt.org/schema/qudt/unit> at focus node _:b88_b88_b88_genid2d5510cd0133f64e44a42fe88022dd4e2a2d0b4363 (source: _:b88_b631)
[etc...]

while obviously the validation fails due to missing qudt:hasUnit there are oddly 3 violations on lists ... which seem non-related but oddly disappear once qudt:hasUnit is manually added to the constraint_test.ttl

Violation of NodeConstraintComponent: "Value does not have shape <https://cube.link/shape/standalone-constraint-constraint#listnode>" with path null at focus node <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> (source: _:b88_b671)
Violation of MinCountConstraintComponent: "list node needs exactly one rdf:first" with path <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> at focus node <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> (source: _:b88_b674)
Violation of MinCountConstraintComponent: "list node needs exactly one rdf:rest" with path <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> at focus node <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> (source: _:b88_b675)

thanks for having a look at this situation

@giacomociti
Copy link
Contributor Author

@Rdataflow the strange behavior disappears using sh:xone (which is even more precise). We will open an issue to investigate the validator

Copy link
Contributor

@tpluscode tpluscode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be good here but we need to fix the out of place sh:details...

sh:path qudt:unit ;
sh:equals qudt:hasUnit ;
] ;
sh:or (
Copy link
Contributor

@Rdataflow Rdataflow Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@giacomociti good catch - replace with sh:xone 💯

nb: see also #162 - introduces xone in other situations

@giacomociti giacomociti merged commit 3f40fd5 into main Apr 12, 2024
25 checks passed
@giacomociti giacomociti deleted the list-shapes branch April 12, 2024 09:52
tpluscode added a commit that referenced this pull request Apr 12, 2024
* switch to `xone` where appropriate

* build(deps): bump express from 4.18.2 to 4.19.2

Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2.
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/master/History.md)
- [Commits](expressjs/express@4.18.2...4.19.2)

---
updated-dependencies:
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* build(deps-dev): bump undici from 5.28.3 to 5.28.4

Bumps [undici](https://github.com/nodejs/undici) from 5.28.3 to 5.28.4.
- [Release notes](https://github.com/nodejs/undici/releases)
- [Commits](nodejs/undici@v5.28.3...v5.28.4)

---
updated-dependencies:
- dependency-name: undici
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* build(deps): bump tar from 6.2.0 to 6.2.1

Bumps [tar](https://github.com/isaacs/node-tar) from 6.2.0 to 6.2.1.
- [Release notes](https://github.com/isaacs/node-tar/releases)
- [Changelog](https://github.com/isaacs/node-tar/blob/main/CHANGELOG.md)
- [Commits](isaacs/node-tar@v6.2.0...v6.2.1)

---
updated-dependencies:
- dependency-name: tar
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* Create sour-cooks-cheer.md

* test: update approved results

* perf: alternative shape for lists (#158)

* perf: alternative shape for lists

* Create friendly-pumas-melt.md

* test malformed lists

* list validation: use xone

* refactor: remove warning about valid `qudt:hasUnit`

* adjust approval tests

* fix typo

---------

Co-authored-by: Tomasz Pluskiewicz <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tomasz Pluskiewicz <[email protected]>
Co-authored-by: Giacomo Citi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants