Replies: 1 comment
-
All string distance functions inherit from the StringComparator base class. This means that they should implement a method called So an implementation of the Jaccard similarity in a file stringcompare/distance/jaccard.py should look something like this: from .comparator import StringComparator
from ..preprocessing.tokenizer import Tokenizer, WhitespaceTokenizer
class Jaccard(StringComparator):
def __init__(self, tokenizer = WhitespaceTokenizer()):
self.tokenizer = tokenizer
def compare(self, s, t):
# TODO: implement computation of the Jaccard similarity function between strings `s` and `t` Then you'd be able to use the jaccard similarity function as follows: from stringcompare.distance.jaccard import Jaccard
comparator = Jaccard()
comparator("A first sentence to compare to a second", "A second sentence to compare to the first") You can fork the StringCompare repositories to make these changes and test them out. Then we'll be able to review the code in a pull request. |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
I was wondering about the implementation of Methods such as Jaccard Similarity or Hamming Distance. I saw in previous python string distance methods that you use a class as well as a string comparator object. Me and Garrett were wondering about how the class with the String comparator related to the overall workflow of the StringCompare package.
Specifically, for example, the lines 18-26 within Jarowinkler.py
Beta Was this translation helpful? Give feedback.
All reactions