This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Harvest throws error when it sees an Ingest LDD file #203

Open
mdrum opened this issue Nov 9, 2021 · 4 comments
Labels
bug Something isn't working icebox s.low

Comments

@mdrum

mdrum commented Nov 9, 2021

🐛 Describe the bug

When running the Harvest tool on one of our bundles that contained an Ingest_LDD XML file, the tool reported the following error:
[ERROR] Missing logical identifier: /Users/mdrum/.CMVolumes/tunneled sbn-archive/dsk1/www/archive/pds4/orex/orex.mission/xml_schema/orex_ldd.xml
This is not an error I would expect to see, because the file in question is not a label.

📜 To Reproduce

Steps to reproduce the behavior:

  1. Download the OSIRIS-REx Mission Bundle: https://sbnarchive.psi.edu/pds4/orex/orex.mission.zip
  2. Download the attached testcase.cfg.txt, change the bundle location in it to match your download location, and rename the file to testcase.cfg
  3. Install the Registry app and set up an Elasticsearch registry running locally on localhost:9200
  4. Run registry-manager create-registry
  5. Run harvest -c testcase.cfg

🕵️ Expected behavior

I would expect the products within the bundle to be processed without error, and the XML schema and Ingest LDD files to be skipped.
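
For illustration, here is a minimal Python sketch of the kind of check that could skip such files. It assumes, without this being confirmed against Harvest's source, that a product label can be recognized by a root element whose name starts with `Product_`, while an Ingest LDD file has `Ingest_LDD` as its root. The function names and the two sample documents are hypothetical, and this is not Harvest's actual logic.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample documents (not taken from the bundle).
LABEL = (
    '<Product_Observational xmlns="http://pds.nasa.gov/pds4/pds/v1">'
    "<Identification_Area>"
    "<logical_identifier>urn:nasa:pds:example:data:item</logical_identifier>"
    "</Identification_Area>"
    "</Product_Observational>"
)
LDD = (
    '<Ingest_LDD xmlns="http://pds.nasa.gov/pds4/pds/v1">'
    "<name>Example</name>"
    "</Ingest_LDD>"
)


def root_local_name(xml_text: str) -> str:
    """Return the root element name with any '{namespace}' prefix removed."""
    root = ET.fromstring(xml_text)
    return root.tag.rsplit("}", 1)[-1]


def looks_like_label(xml_text: str) -> bool:
    """Assumption: product labels have a root element named 'Product_*'."""
    try:
        return root_local_name(xml_text).startswith("Product_")
    except ET.ParseError:
        # Not well-formed XML, so it cannot be a label.
        return False


for name, text in (("label", LABEL), ("ldd", LDD)):
    action = "process" if looks_like_label(text) else "skip"
    print(f"{name}: {action}")
```

With a check like this, a file such as orex_ldd.xml would be skipped quietly (or logged at a lower level) instead of being reported as a label with a missing logical identifier.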

📚 Version of Software Used

Registry App 1.0.0

🩺 Test Data / Additional context

Provided in reproduction steps

🏞 Screenshots

[Image: CleanShot 2021-11-09 at 15 32 48@2x]

🖥 System Info

  • OS: MacOS 11.6

🦄 Related requirements

⚙️ Engineering Details

@mdrum mdrum added bug Something isn't working needs:triage labels Nov 9, 2021
@jordanpadams
Contributor

jordanpadams commented Nov 9, 2021

@mdrum we will look into this, but I was not aware that XML labels could even be archived at this time? We have not built support into any of our tools to handle XML files in an archive. That was part of the whole lblx convo. I imagine validate balks at this as well?

@mdrum
Author

mdrum commented Nov 10, 2021

@jordanpadams Yeah, the LDD file isn't being archived; it just happens to be inside the bundle directory alongside the other dictionary artifacts. It would be nice if the tool were smart enough to ignore XML files that aren't labels, but another nice-to-have feature would be some way to filter files within a bundle by path/filename (rather than filtering on product class).

@jordanpadams
Contributor

@mdrum I like the latter idea. Are you thinking a regex capability for filtering? Or something like an ignores list?

@mdrum
Author

mdrum commented Nov 10, 2021

@jordanpadams Regex would definitely be more useful, and maybe even begin to allow for some sort of parallelization. A filter for whitelisting or blacklisting by path and name would allow a job to be broken up into pieces.
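
To make the proposal concrete, here is a rough Python sketch of regex-based include/exclude filtering on file paths. Everything in it is an assumption for illustration: `filter_paths`, its `include` and `exclude` parameters, and all of the sample paths except the xml_schema/orex_ldd.xml location are hypothetical, and Harvest does not necessarily expose any option like this.

```python
import re
from typing import Iterable, List, Optional


def filter_paths(
    paths: Iterable[str],
    include: Optional[str] = None,
    exclude: Optional[str] = None,
) -> List[str]:
    """Keep paths matching `include` (if given) and not matching `exclude`."""
    inc = re.compile(include) if include else None
    exc = re.compile(exclude) if exclude else None
    kept = []
    for path in paths:
        if inc and not inc.search(path):
            continue  # not on the whitelist
        if exc and exc.search(path):
            continue  # on the blacklist
        kept.append(path)
    return kept


# Hypothetical file listing for a bundle directory.
paths = [
    "orex.mission/bundle.xml",
    "orex.mission/xml_schema/orex_ldd.xml",
    "orex.mission/document/collection.xml",
    "orex.mission/context/collection.xml",
]

# Blacklist: skip everything under xml_schema/.
print(filter_paths(paths, exclude=r"/xml_schema/"))

# Whitelist: one "piece" of the job, covering only the document collection.
print(filter_paths(paths, include=r"^orex\.mission/document/"))
```

Running separate jobs with disjoint `include` patterns is one way the work could be broken into pieces, as long as the patterns together cover every file that should be harvested.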

@jordanpadams jordanpadams removed their assignment Nov 12, 2021