Annoying Warning | process.extractOne | scorer=fuzz.token_sort_ratio | #422

Open
storesace-jorgelopes opened this issue Dec 10, 2024 · 5 comments

Comments

@storesace-jorgelopes

I recently switched from the fuzzywuzzy library to rapidfuzz.
However, on this line of code I get the following warning:

from rapidfuzz import fuzz, process
best_match = process.extractOne(query=unidecode(col), choices=patterns, scorer=fuzz.token_sort_ratio, score_cutoff=80.0)

Unexpected type(s):
(str, list[Union[str, Any]], Union[(s1: Sequence[Hashable], s2: Sequence[Hashable], Any, processor: None, score_cutoff: Optional[float]) -> float, (s1: _UnprocessedType1, s2: _UnprocessedType2, Any, processor: (Union[_UnprocessedType1, _UnprocessedType2]) -> Sequence[Hashable], score_cutoff: Optional[float]) -> float], float)
Possible type(s):
(Optional[Sequence[Hashable]], Mapping[_KeyType, Optional[_StringType2]], _Scorer[Sequence[Hashable], Sequence[Hashable], float, float], Optional[float])
(Optional[Sequence[Hashable]], Iterable[Optional[_StringType2]], _Scorer[Sequence[Hashable], Sequence[Hashable], float, float], Optional[float])
(Optional[str], Mapping[_KeyType, Optional[_StringType2]], _Scorer[str, _StringType2, _ResultType, _ResultType], Optional[float])
(Optional[str], Iterable[Optional[_StringType2]], _Scorer[str, _StringType2, _ResultType, _ResultType], Optional[float])

This warning only occurs because of scorer=fuzz.token_sort_ratio. If I remove that argument, no warning appears.
Could this be a bug in the type hints of the function signature?

@maxbachmann
Member

It is quite possible that there is an error in the type hints. Can you send me a complete sample that allows me to reproduce this warning? Which tool do you use for type checking?

@storesace-jorgelopes
Author

The sample to reproduce it is the code I sent previously:

from rapidfuzz import fuzz, process
best_match = process.extractOne(query=unidecode(col), choices=patterns, scorer=fuzz.token_sort_ratio, score_cutoff=80.0)

I use PyCharm from JetBrains, so I assume it's PyCharm's default type checker that is being used.

@maxbachmann
Member

I am asking because this is not a complete sample. As written, it would raise the following warnings:

  • unidecode doesn't exist
  • col doesn't exist
  • patterns doesn't exist

@storesace-jorgelopes
Author

I see...
Here's another example:

from rapidfuzz import fuzz, process

best_match = process.extractOne(
    query="val",
    choices=["lote", "lot", "data validade", "val", "validade", "iec"],
    scorer=fuzz.token_sort_ratio,
    score_cutoff=80.0,
)

Unexpected type(s):
(str, list[str], Union[(s1: Sequence[Hashable], s2: Sequence[Hashable], Any, processor: None, score_cutoff: Optional[float]) -> float, (s1: _UnprocessedType1, s2: _UnprocessedType2, Any, processor: (Union[_UnprocessedType1, _UnprocessedType2]) -> Sequence[Hashable], score_cutoff: Optional[float]) -> float], float)
Possible type(s):
(Optional[Sequence[Hashable]], Mapping[_KeyType, Optional[_StringType2]], _Scorer[Sequence[Hashable], Sequence[Hashable], float, float], Optional[float])
(Optional[Sequence[Hashable]], Iterable[Optional[str]], _Scorer[Sequence[Hashable], Sequence[Hashable], float, float], Optional[float])
(Optional[str], Mapping[_KeyType, Optional[_StringType2]], _Scorer[str, _StringType2, _ResultType, _ResultType], Optional[float])
(Optional[str], Iterable[Optional[str]], _Scorer[str, str, _ResultType, _ResultType], Optional[float])
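
Reading the warning above, the scorer seems to be typed as an overloaded callable (one signature with processor: None, one with a processor function), while the parameter expects something shaped like a generic _Scorer. The sketch below is a simplified stand-in for that pattern using only the standard library. The names Scorer, toy_ratio and extract_one are hypothetical and are not rapidfuzz's actual stubs or implementation; it only illustrates the kind of matching a type checker has to perform.

```python
from typing import Callable, Hashable, Optional, Protocol, Sequence, TypeVar, overload

T = TypeVar("T")


class Scorer(Protocol):
    """Hypothetical stand-in for the `_Scorer` protocol named in the warning."""

    def __call__(
        self,
        s1: Sequence[Hashable],
        s2: Sequence[Hashable],
        *,
        score_cutoff: Optional[float] = None,
    ) -> float: ...


# Two overloads, mirroring the Union of two signatures shown in the warning.
@overload
def toy_ratio(
    s1: Sequence[Hashable],
    s2: Sequence[Hashable],
    *,
    processor: None = None,
    score_cutoff: Optional[float] = None,
) -> float: ...
@overload
def toy_ratio(
    s1: T,
    s2: T,
    *,
    processor: Callable[[T], Sequence[Hashable]],
    score_cutoff: Optional[float] = None,
) -> float: ...
def toy_ratio(s1, s2, *, processor=None, score_cutoff=None):
    # Toy scoring rule (an assumption for illustration): 100 on equality, else 0.
    if processor is not None:
        s1, s2 = processor(s1), processor(s2)
    score = 100.0 if s1 == s2 else 0.0
    if score_cutoff is not None and score < score_cutoff:
        return 0.0
    return score


def extract_one(query, choices, scorer: Scorer, score_cutoff: Optional[float] = None):
    """Return (choice, score, index) for the best-scoring choice, or None."""
    best = None
    for index, choice in enumerate(choices):
        score = scorer(query, choice, score_cutoff=score_cutoff)
        if score_cutoff is not None and score < score_cutoff:
            continue
        if best is None or score > best[1]:
            best = (choice, score, index)
    return best


result = extract_one(
    "val",
    ["lote", "lot", "data validade", "val", "validade", "iec"],
    scorer=toy_ratio,
    score_cutoff=80.0,
)
print(result)
```

To accept scorer=toy_ratio, a checker has to find that the first overload is compatible with Scorer, because processor has a default value there. A checker that treats the overload set as a plain Union could plausibly report a mismatch like the one above, though that is a guess about the cause rather than a confirmed diagnosis.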

@maxbachmann
Member

I can reproduce the warning in PyCharm. The same code works as expected in mypy and pyright, and I can't see anything wrong with the type hints either. So I assume some of these more advanced type hinting features just don't work as expected in PyCharm. If there were a way to provide different type hints for PyCharm, we could make them a lot more lax. However, I am not aware of any way to detect that PyCharm is being used for the type checking.
