As a user, I want to exclude files from harvest run based upon directory path #218

Open
jordanpadams opened this issue Nov 14, 2024 · 0 comments

jordanpadams commented Nov 14, 2024

Checked for duplicates

No - I haven't checked

🧑‍🔬 User Persona(s)

Node Operator

💪 Motivation

...so that I can skip directories I do not want harvest to load

📖 Additional Details

From user:

We have non-archive files in our service directories to support serving the data: DOI landing pages, for example, and pre-made bulk download files. If the loader assumes that everything in a directory is either a PDS4 label or something pointed to by a PDS4 label, it's going to choke. We could make an exclusion list of directory names, file names, and file extensions to ignore, if that would help.
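
To make the suggestion concrete, here is a minimal sketch of what an exclusion list of directory names, file names, and file extensions could look like. It is written in Python for illustration only and does not reflect how harvest is implemented; `ExclusionList`, its fields, and the sample entries (`doi`, `index.html`, `.zip`) are all hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ExclusionList:
    """Hypothetical exclusion rules; none of these names come from harvest."""

    dir_names: frozenset = field(default_factory=frozenset)
    file_names: frozenset = field(default_factory=frozenset)
    # Extensions are stored lower-case and include the leading dot.
    extensions: frozenset = field(default_factory=frozenset)

    def is_excluded(self, relative_path: str) -> bool:
        path = PurePosixPath(relative_path)
        # A file is skipped if any ancestor directory has an excluded name.
        if any(part in self.dir_names for part in path.parts[:-1]):
            return True
        # Exact file-name match.
        if path.name in self.file_names:
            return True
        # Case-insensitive match on the final extension.
        return path.suffix.lower() in self.extensions


# Illustrative entries only; a real list would be supplied by the node operator.
rules = ExclusionList(
    dir_names=frozenset({"doi"}),
    file_names=frozenset({"index.html"}),
    extensions=frozenset({".zip"}),
)
```

With these rules, `rules.is_excluded("data/bulk_download.ZIP")` is true, while an ordinary label such as `data/product.xml` is still picked up.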

Acceptance Criteria

Given a bundle_root/ directory with sub-directories bundle_root/subdir1 and bundle_root/subdir2, all containing PDS4 XML products
When I perform harvest run with dataPath = bundle_root/ and excludePath = bundle_root/subdir2
Then I expect all the data from bundle_root/ and bundle_root/subdir1 to be loaded into the Registry, but NOT data from bundle_root/subdir2
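
The expected behavior can be sketched as a directory walk that prunes excluded sub-directories before descending into them. This is an illustration in Python, not harvest code: `find_labels` and its parameters are hypothetical, and it assumes exclusion paths are given relative to the data path, which the issue does not specify.

```python
import os
from pathlib import Path


def find_labels(data_path, exclude_paths=()):
    """Yield .xml files under data_path, skipping excluded sub-directories.

    exclude_paths are interpreted relative to data_path. That is an
    assumption; the issue does not say how excludePath should be resolved.
    """
    root = Path(data_path).resolve()
    excluded = {(root / p).resolve() for p in exclude_paths}
    for current, dirs, files in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirs[:] = sorted(d for d in dirs if Path(current, d) not in excluded)
        for name in sorted(files):
            if name.lower().endswith(".xml"):
                yield Path(current, name).relative_to(root)
```

Pruning during the walk, rather than filtering the results afterwards, means large excluded trees such as bulk download directories are never traversed at all.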

⚙️ Engineering Details

No response

🎉 I&T

No response
