Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discard/replace "null" values by pattern. #346

Closed
wants to merge 2 commits into from

Conversation

blackwinter
Copy link
Member

@blackwinter blackwinter requested a review from fsteeg December 3, 2020 16:56
@blackwinter blackwinter self-assigned this Dec 3, 2020
@fsteeg
Copy link
Member

fsteeg commented Dec 7, 2020

Hm, when using this for the original use case in metafacture/metafacture-fix#34, I think having an actual regex pattern for the nullness is actually a bit of a stretch, as you mentioned yourself in metafacture/metafacture-fix#34 (comment). It also makes the original use case more complex (regex that matches only empty strings). While different systems can indicate nullness in different ways (null, "NULL", "nil", ""), each does so with some constant value, so I think the comparison should not be a pattern match, but an equality test.

@fsteeg fsteeg removed their request for review December 7, 2020 10:37
@blackwinter
Copy link
Member Author

Indeed, I contemplated a list of strings myself. But Flux doesn't support those.

Would you mind having to apply the filter multiple times in case there were multiple keywords to filter? Then we could implement a (varargs) keyword setter instead of a pattern setter.

It also makes the original use case more complex (regex that matches only empty strings).

How so? \A\z matches empty strings pretty easily, no?

@fsteeg
Copy link
Member

fsteeg commented Dec 7, 2020

Would you mind having to apply the filter multiple times in case there were multiple keywords to filter? Then we could implement a (varargs) keyword setter instead of a pattern setter.

I'm not sure what you mean here. What would that look like in a Flux?

I'm a bit confused in general. For the use case I saw in metafacture/metafacture-fix#34, I imagine a single config string that replaces null. So I want either filter-nulls(value: "") or filter-nulls(value: "NULL"), but not both in one workflow. Maybe I'm missing something here. Because also, I don't quite get what you are doing in the tests: your new literal value is literal-NULL, and that is filtered away. From my understanding of the use case, this is not correct, only fields that are exactly NULL should be filtered away.

How so? \A\z matches empty strings pretty easily, no?

Ha, so after researching \A\z I now know I would have fallen into the trap of simply using ^$. On the other side, filter-nulls(value: "") requires zero regex knowledge. Not saying it might not be worth having regex support here, we have it elsewhere in the framework. But it makes the original use case more complex, and I don't even see why we need it.

Finally, seeing all these configured filter-nulls, maybe we should reconsider using NullFilter for this at all, and create a proper, general usage StringLiteralFilter instead?

@blackwinter
Copy link
Member Author

blackwinter commented Dec 7, 2020

What would that look like in a Flux?

filter-nulls(keywords="null")|
filter-nulls(keywords="NULL")|

I'm a bit confused in general. For the use case I saw in metafacture/metafacture-fix#34, I imagine a single config string that replaces null. So I want either filter-nulls(value: "") or filter-nulls(value: "NULL"), but not both in one workflow. Maybe I'm missing something here.

It's just a preemptive measure on my part. YAGNI? Maybe. To me it just feels a little restrictive (if we're going that route at all), I certainly can imagine someone wanting a more flexible filter down the road.

Because also, I don't quite get what you are doing in the tests: your new literal value is literal-NULL, and that is filtered away. From my understanding of the use case, this is not correct, only fields that are exactly NULL should be filtered away.

It's a pattern. If I wanted to filter out only exact matches, I could have done so with \ANULL\z.

Finally, seeing all these configured filter-nulls, maybe we should reconsider using NullFilter for this at all, and create a proper, general usage StringLiteralFilter instead?

That's fine with me. If you prefer NullFilter, I have another proposal based on lists of keywords (values). Otherwise, let's just scrap this and implement StringLiteralFilter (filter-literals) instead.

@fsteeg
Copy link
Member

fsteeg commented Dec 8, 2020

Then we could implement a (varargs) keyword setter instead of a pattern setter.

filter-nulls(keywords="null") |
filter-nulls(keywords="NULL") |

Interesting, and with the keywords being a varargs argument, this will result in a single NullFilter object internally, adding the keywords, instead of filtering the stream twice (as it would with a single string argument)? Would that also work when these commands are not consecutive?

Otherwise, let's just scrap this and implement StringLiteralFilter (filter-literals) instead.

+1, I think that makes more sense, these filter-nulls which don't actually filter nulls seem wrong.

@blackwinter
Copy link
Member Author

Interesting, and with the keywords being a varargs argument, this will result in a single NullFilter object internally, adding the keywords, instead of filtering the stream twice (as it would with a single string argument)?

No, AFAIUI, the Flow builder would add two separate elements.

+1, I think that makes more sense, these filter-nulls which don't actually filter nulls seem wrong.

OK, closing then.

@blackwinter blackwinter closed this Dec 8, 2020
@blackwinter blackwinter deleted the filter-nulls-by-pattern branch December 8, 2020 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add module or function to delete all empty/null values
2 participants