Release SPUContext Models · vngrs-ai/vnlp

SentencePiece Unigram Context (SPUContext) models have been added for Named Entity Recognition, Dependency Parsing, Part of Speech Tagging and Sentiment Analysis. They are now the default models.
SPUContext models are more compact than the previous models, up to 4x faster, and perform significantly better. See the metrics table on the main page for a comparison.
SPUContext models use SentencePiece Unigram tokenization.
The wheel file is now 80% smaller, and each model downloads its weights the first time it is initialized.
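The download-on-first-use behavior can be pictured with a minimal sketch like the one below. It is not VNLP's actual code: the function name ensure_weights, the file name example_model.bin, and the injected fetch callable are all hypothetical, and the fake downloader stands in for a real network request so the example runs offline.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_weights(name: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> Path:
    """Return the cached weights path, fetching the file only if it is missing."""
    path = cache_dir / name
    if not path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch(name))
    return path


fetch_log = []


def fake_fetch(name: str) -> bytes:
    # Hypothetical stand-in for a real download; records every call.
    fetch_log.append(name)
    return b"weights:" + name.encode()


cache_dir = Path(tempfile.mkdtemp()) / "weights"
first = ensure_weights("example_model.bin", cache_dir, fake_fetch)   # fetches
second = ensure_weights("example_model.bin", cache_dir, fake_fetch)  # reuses the cache
```

Under this pattern only the first initialization pays the download cost; later ones find the file on disk and skip the fetch.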
To evaluate a deep learning based model, pass the "evaluate = True" flag when initializing it, e.g., NamedEntityRecognizer(model = 'CharNER', evaluate = True). This loads weights that were NOT trained on the test sets.
The former Python API has become a generic user API that provides an abstraction over the implemented methods. The desired model is selected with the "model" argument, e.g., NamedEntityRecognizer(model = 'CharNER').
