Releases: dayyass/text-classification-baseline
v0.1.6
v0.1.5
Release v0.1.5 🥳🎉🍾
- added pymorphy2 lemmatization (#81)
- added token frequency support (#85)
- added threshold selection for binary classification (#86)
- added arbitrary save folder name (#83)
pymorphy2 lemmatization (config.yaml)
# preprocessing
# (included in resulting model pipeline, so preserved for inference)
preprocessing:
  lemmatization: pymorphy2
token frequency support
- text_clf.token_frequency.get_token_frequency(path_to_config) - get token frequency of the train dataset according to the config file parameters
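To make concrete what "token frequency" means here, the sketch below counts tokens over a tiny in-memory corpus. It is not the library's implementation: the helper name `count_token_frequency`, the whitespace tokenization, the lowercasing, and the sample texts are all assumptions made for illustration. The real function reads its data and preprocessing settings from the config file.

```python
from collections import Counter
from typing import Dict, Iterable


def count_token_frequency(texts: Iterable[str]) -> Dict[str, int]:
    """Count occurrences of each lowercased, whitespace-separated token.

    Hypothetical helper: the tokenization used here is an assumption,
    not the behaviour of text_clf.token_frequency.get_token_frequency.
    """
    counter: Counter = Counter()
    for text in texts:
        counter.update(text.lower().split())
    return dict(counter)


# Made-up training texts, for illustration only.
train_texts = ["good movie", "bad movie", "good plot good cast"]
token_frequency = count_token_frequency(train_texts)

# Print tokens from most to least frequent (ties broken alphabetically).
for token, count in sorted(token_frequency.items(), key=lambda kv: (-kv[1], kv[0])):
    print(token, count)
```

A frequency table like this is typically used to spot very rare or very common tokens before tuning the vectorizer.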
threshold selection for binary classification
- text_clf.pr_roc_curve.get_precision_recall_curve(path_to_model_folder) - get precision and recall metrics for the precision-recall curve
- text_clf.pr_roc_curve.get_roc_curve(path_to_model_folder) - get false positive rate (fpr) and true positive rate (tpr) metrics for the ROC curve
- text_clf.pr_roc_curve.plot_precision_recall_curve(precision, recall) - plot the precision-recall curve
- text_clf.pr_roc_curve.plot_roc_curve(fpr, tpr) - plot the ROC curve
- text_clf.pr_roc_curve.plot_precision_recall_f1_curves_for_thresholds(precision, recall, thresholds) - plot precision, recall, and f1-score curves for probability thresholds
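The idea behind threshold selection can be sketched without the library: compute precision, recall, and f1-score at each candidate probability threshold, then keep the threshold with the best f1-score. The code below is a minimal, standard-library illustration under stated assumptions. The function names `precision_recall_f1` and `select_threshold`, the choice of f1 as the selection criterion, and the sample labels and probabilities are all hypothetical and are not taken from text_clf.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Tuple[float, float, float]:
    """Precision, recall and F1 when predicting class 1 for prob >= threshold."""
    tp = fp = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def select_threshold(
    y_true: Sequence[int], y_prob: Sequence[float]
) -> Tuple[float, float]:
    """Return (threshold, f1) with the highest F1 among the distinct probabilities."""
    best_threshold, best_f1 = 0.5, -1.0
    for threshold in sorted(set(y_prob)):
        _, _, f1 = precision_recall_f1(y_true, y_prob, threshold)
        if f1 > best_f1:  # strict comparison keeps the lowest threshold on ties
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1


# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
best_threshold, best_f1 = select_threshold(y_true, y_prob)
print(best_threshold, round(best_f1, 3))
```

Maximizing f1 is only one possible criterion; when false positives and false negatives carry different costs, a threshold that meets a minimum precision or recall may be a better choice.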
arbitrary save folder name (config.yaml)
experiment_name: model