Remove Vocab.oov_prob, fix lexeme_ lookups table creation #12242
Remove `Vocab.oov_prob` and `Vocab.cfg` (both unused) and fix how `lexeme_prob` and `lexeme_cluster` tables are added dynamically when an attribute is set on a vocab where these tables don't already exist.
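The lazy table creation described above can be sketched roughly as follows. This is a toy model and not spaCy's implementation: `ToyLookups`, `ToyVocab`, and `set_lexeme_attr` are hypothetical names chosen for illustration, and only the table names `lexeme_prob` and `lexeme_cluster` come from the description.

```python
class ToyLookups:
    """Hypothetical stand-in for a registry of named lookup tables."""

    def __init__(self):
        self._tables = {}

    def has_table(self, name):
        return name in self._tables

    def add_table(self, name):
        # Refuse to silently overwrite an existing table.
        if name in self._tables:
            raise ValueError(f"table {name!r} already exists")
        self._tables[name] = {}
        return self._tables[name]

    def get_table(self, name):
        return self._tables[name]


class ToyVocab:
    """Hypothetical vocab that starts without any lexeme tables."""

    def __init__(self):
        self.lookups = ToyLookups()

    def set_lexeme_attr(self, table_name, word, value):
        # Create the table on first write instead of assuming it exists.
        if not self.lookups.has_table(table_name):
            self.lookups.add_table(table_name)
        self.lookups.get_table(table_name)[word] = value


vocab = ToyVocab()
had_prob_table_before = vocab.lookups.has_table("lexeme_prob")
vocab.set_lexeme_attr("lexeme_prob", "opera", -2.0)
has_prob_table_after = vocab.lookups.has_table("lexeme_prob")
```

In this sketch, setting one attribute creates only the table it needs, so `lexeme_cluster` stays absent until a cluster value is written.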
assert lex1 < lex2
assert lex2 > lex1
@pytest.mark.parametrize("word1,prob1,word2,prob2", [("NOUN", -1, "opera", -2)])
The original test did not make sense: `lex1 < lex2` compares `ORTH`, not `PROB`. Replaced with a test for the lookups table creation.
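To illustrate why a comparison on the lexeme itself says nothing about probabilities, here is a minimal sketch. It assumes a hypothetical `ToyLexeme` class whose ordering is defined on an integer `orth` ID; the class and the ID values are invented for illustration and are not spaCy's `Lexeme`.

```python
from dataclasses import dataclass
from functools import total_ordering


@total_ordering
@dataclass(eq=False)
class ToyLexeme:
    orth: int    # hypothetical ID of the word form
    prob: float  # hypothetical log probability

    # Ordering looks only at `orth`, never at `prob`.
    def __eq__(self, other):
        return self.orth == other.orth

    def __lt__(self, other):
        return self.orth < other.orth


lex1 = ToyLexeme(orth=1, prob=-1.0)
lex2 = ToyLexeme(orth=2, prob=-2.0)

ordered_by_orth = lex1 < lex2            # decided by the IDs alone
ordered_by_prob = lex1.prob < lex2.prob  # explicit comparison on prob
```

Here the comparison on the objects succeeds even though the probabilities are ordered the other way, so an assertion like `lex1 < lex2` would pass regardless of the `prob` values being set.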
This is meant to be a teeny step in the right direction in case these attributes are preserved in this format. It may not make sense to continue to keep the They would be a good candidate for a refactored
Actually, I'm not sure this is helpful for
You'll maybe hate me for saying this because it's indeed been pretty broken / messy. But: I don't feel great about having the Is there a version of this that lets us keep the oov probability as a value in the vocab, but handles it correctly?
Temporarily closing this PR as we currently don't have the bandwidth to finish this.