Skip to content
This repository has been archived by the owner on Jan 27, 2025. It is now read-only.

Consolidate path traversal #170

Merged
merged 63 commits into from
Apr 8, 2022
Merged

Consolidate path traversal #170

merged 63 commits into from
Apr 8, 2022

Conversation

fsteeg
Copy link
Member

@fsteeg fsteeg commented Feb 28, 2022

This is my suggestion to fix #102.

Based on the already merged extraction of FixPath (#133), here are some high-level changes:

  • Make FixMethod interact with record, which interacts with FixPath (instead of creating FixPaths everywhere)
  • Consolidate inserting values: replace append/replace methods with insertInto usage
  • Implement transform methods based on findIn and insertInto to resolve the main issue of Consolidate path traversal. #102
  • For cases where the findIn path does not work with insertInto (like title-? or some.*.nested.*), use a new Value#path field containing the full path to each value (created by Metafix#flattener)

This fixes some things that didn't work before (and changes the behavior for some things that did not work before and still don't work). See tests diff and issues mentioned in the commit messages.

I think this is an improvement over the current master state, and a good foundation for moving forward. Depending on the current, concrete requirements from our projects, this could be:

  • Further improve behavior (e.g. handling of repeated fields, wildcards) by revisting the issues that mention #102
  • Look into performance improvements (improvements in findIn/insertInto will benefit all transformations)
  • Refactoring this to get it closer to the Catmandu terminology (e.g. new FixPath(field, hash).get())

Causes minor behavior changes (see new @MetafixToDo annotations)

Contains comments sketching possible ways to consolidate, basically
on two levels: 1) implement transform/insert/remove in terms of find
(find does all path traversal, then the result is changed / deleted)
2) move hash / array methods for each operation into hash / array

Short-term next step I'd look into: remove / unify InsertMode usage
(hence the somewhat odd null / default InsertMode in `insertInto`)
Replace replace*/append* methods with insert* and InsertMode usage

Use get / set / add / remove methods on Record for interaction
For cases where the findIn path does not work with insertInto
(like `title-?` or `some.*.nested.*`), use a new Value#path field
Copy link
Member

@blackwinter blackwinter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry but there's just too much going on here. c25ede3 breaks our Limetrans workflow in a bunch of different ways (see below) and I'm officially giving up. Also, there seem to be only a few To Dos resolved, but more added. Not really the outcome we were hoping for, is it? (Integration test results as of 6d74bca: 3 fixed, 2 broken.)

So here are just some preliminary remarks and questions, this is not the final assessment.


Behaviour changes observed in Limetrans Alma mapping:

  1. It's unclear where they're coming from but we get null values in records now: Error while executing Fix expression [...]: vacuum(), caused by java.lang.IllegalStateException: Expected Array or Hash or String, got null.
  2. uniq() does not work anymore due to the inclusion of path in Value.equals().
  3. lookup() behaviour seems to have changed unexpectedly (see also Add delete:-Attribute to lookup and change the default filter-modus to an optional configuration #149).
  4. Entity boundaries are violated (probably due to some error/change w.r.t. $append).

@blackwinter blackwinter assigned fsteeg and unassigned blackwinter Mar 2, 2022
@fsteeg
Copy link
Member Author

fsteeg commented Mar 4, 2022

Thank you for your review and for helping me make sense of all of this @blackwinter!

I've pushed my current state, which addresses many of the issues you brought up.

I'll continue with the remaining ones (capitalize & append/prepend).

Copy link
Member

@blackwinter blackwinter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks for the update.

@fsteeg
Copy link
Member Author

fsteeg commented Mar 7, 2022

Changed the prepend/append/capitalize/trim tests to the original expected behavior and tweaked the implementation accordingly in be764e4.

Thanks for the additional feedback, addressed it as far as I could for now in 8b27a52.

@fsteeg fsteeg requested a review from blackwinter March 7, 2022 12:02
@fsteeg fsteeg assigned blackwinter and unassigned fsteeg Mar 7, 2022
@blackwinter blackwinter assigned fsteeg and unassigned blackwinter Mar 7, 2022
@blackwinter
Copy link
Member

Tweaked MetafixScriptTest for order issue in 4961b95.

The behaviour has changed and that should be reflected in the test: 0dfcfb8.

The remaining issue in MetafixScriptTest seems to have the same cause as 7b74937 (not throwing on transform of non-string field).

No, both shouldSkipExpressionOnExecutionException and shouldSkipRecordOnExecutionException are throwing an exception, but should react differently.

blackwinter added a commit to hbz/Catmandu that referenced this pull request Apr 6, 2022
@fsteeg fsteeg requested a review from blackwinter April 8, 2022 13:29
@fsteeg fsteeg assigned blackwinter and unassigned fsteeg Apr 8, 2022
Copy link
Member

@blackwinter blackwinter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The moment of truth, huh? 😟 😨 😰

(@MetafixToDo: 23 -> 29, todo.txt: 86 -> 72)

@blackwinter blackwinter assigned fsteeg and unassigned blackwinter Apr 8, 2022
@fsteeg
Copy link
Member Author

fsteeg commented Apr 8, 2022

blackwinter approved these changes

\o/

(@MetafixToDo: 23 -> 29, todo.txt: 86 -> 72)

But we fixed the regressions, these are additional tests. More tests (even failing) are good, no?

Looking at passing tests, I'm very happy: 353 -> 401 (unit), 141 -> 155 (integration).

@fsteeg fsteeg merged commit 15e2237 into master Apr 8, 2022
@fsteeg fsteeg deleted the 102-path branch April 8, 2022 14:54
@blackwinter
Copy link
Member

But we fixed the regressions, these are additional tests. More tests (even failing) are good, no?

Yes, I was just tying into the previous remarks from way back when this discussion started ;)

blackwinter added a commit to hbz/limetrans that referenced this pull request Apr 8, 2022
…cture-fix#170)

Differences in reference files:

- Various (inconsequential) order changes.
blackwinter added a commit that referenced this pull request Apr 11, 2022
…c42874)

IOW, don't split field names received from metadata stream (`Metafix.addValue()`).
@blackwinter
Copy link
Member

Found another regression after all :(

Tests and (partial) fix in 1ecbcdf.

blackwinter added a commit to hbz/limetrans that referenced this pull request Apr 14, 2022
…cture-fix#170)

Differences in reference files:

- Various (inconsequential) order changes.
blackwinter added a commit to hbz/limetrans that referenced this pull request Apr 26, 2022
…cture-fix#170)

Differences in reference files:

- Various (inconsequential) order changes.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consolidate path traversal.
2 participants