Releases: KennethEnevoldsen/scandinavian-embedding-benchmark
v0.9.2
v0.9.1
v0.9.1 (2024-01-26)
Fix
- fix: ran swednsts and reduced dataset size (6c0a030)
- fix: ensure that metrics are correctly formatted from MTEB (b5873b8)
Unknown
- Merge pull request #100 from KennethEnevoldsen/sts_vs_retrieval
  Reduce size of SwednSTS (c2c32b7)
- fixed type hints (0a5fd27)
- Merge branch 'main' of https://github.com/KennethEnevoldsen/scandinavian-embedding-benchmark into sts_vs_retrieval (141ed73)
- Merge pull request #89 from KennethEnevoldsen/stuff_runs_tests
  Added integration test for four model types (f427a6d)
- Merge pull request #92 from KennethEnevoldsen/custom_embeddings
  Custom embeddings for E5 and Cohere + Interface changes to accommodate this (aeb32ce)
- Reset test cases incorrectly overwritten by a merge conflict resolution (ccdd886)
- Merge branch 'custom_embeddings' of https://github.com/KennethEnevoldsen/scandinavian-embedding-benchmark into custom_embeddings (2414337)
- Put resetting the model's encode() method to a finally clause (34a2612)
- Merge branch 'main' into custom_embeddings (18cd858)
- Removed debugging print statements from E5 (cc12070)
- Made EmbeddingModel into a dataclass instead of BaseModel (75debec)
- Removed reference to MTEBTask from ScaLA (674a005)
- Replaced MTEBTaskModel with partial() (79708bf)
- Added encode_queries and encode_documents to EmbeddingModel, made task optional (ecee037)
- Merge branch 'main' into stuff_runs_tests (01e8979)
- Moved models to @parametrize (fafcb39)
v0.9.0
v0.9.0 (2024-01-26)
Feature
- feat: Added performance metrics for danfever (22eb72b)
Unknown
- Merge pull request #97 from KennethEnevoldsen/add-danfever
  Add danFEVER (801753f)
- appease pyright (a572962)
- tests: remove tests which have to be changed when adding new datasets (04aa44e)
- tests: convert test_task back to normal (be2c071)
- Merge branch 'main' of https://github.com/KennethEnevoldsen/scandinavian-embedding-benchmark into add-danfever (69a5a03)
v0.8.0
v0.8.0 (2024-01-25)
Ci
- ci: fix misspecified yaml syntax (ca5567c)
Documentation
Feature
Fix
- fix: fixed error arising from merge (11e28d6)
- fix: updated based on static type checks (4752f07)
- fix: move description to the end so as to make printing of task object prettier (f8ec70d)
- fix: reduced size of SwednClustering and ensure that clusters match with document size (0b70730)
Style
- style: ran linting (05b6bf9)
Test
- test: Performance using 5x2048 examples is 8.13 (ed5cb5d)
- test: Performance using 5x10000 examples is 13.80 (ed36b82)
- test: Performance using 2x10000 examples is 8.70 (6fe30b7)
- test: Performance using 10000 examples is 8.46 (630769c)
- test: Performance using 1000 examples is 8.12 (7732c32)
- test: Performance using 100 examples is 21.07 (82f7b3f)
Unknown
- Merge pull request #96 from KennethEnevoldsen/add-swedn-clustering
  Add Swedn and VG clustering datasets (8537e12)
- Merge branch 'main' of https://github.com/KennethEnevoldsen/scandinavian-embedding-benchmark into add-swedn-clustering (18f9afb)
- tests: refactored tests to not be highly dependent on a few tasks (4b1eaa5)
- Added a bunch of experiments for the vg summarization. (d9a13cb)
- Merge pull request #90 from KennethEnevoldsen/types
  Moved task types to task interface and deleted types module (7c3b582)
- Added English to Language type (221bdd8)
- Removed faulty import in E5 models (601002c)
- Merge pull request #91 from KennethEnevoldsen/new_models
  Added Jina base (95c515e)
v0.7.1
v0.7.1 (2024-01-23)
Fix
- fix: added task argument to TranslateE5 encoding (71dcd09)
Unknown
- Merge pull request #87 from KennethEnevoldsen/new_models
  Added XLM-Roberta large and LaBSE (74fcf43)
- Merge pull request #85 from x-tabdeveloping/main
  Added FastText and Translate-E5 models (2d9043e)
- Removed commented-out lines (373b937)
- Removed duplicate model (476d679)
- Fixed duplicate model names (7275686)
- Added XLM-Roberta large and LaBSE (858db1b)
- Merging upstream into the branch so that it contains the fixed E5 models, that pass along the task. (f6f71db)
- TranslateE5 now uses E5Wrapper to ensure task-correct embeddings and prefixes. (5eff1fc)
- Translate now returns a single string instead of a list (e1cefd9)
v0.7.0
v0.6.0
v0.5.5
v0.5.5 (2024-01-22)
Fix
- fix: Add toggle for verbosity on the cli and remove duplicate entries in table (4d26fce)
Unknown
- Merge pull request #74 from KennethEnevoldsen/verbosity_for_cli
  Fix verbosity toggle on CLI and remove duplicate entries in table (99ef0f2)
- Remove model results for repo (2435011)
v0.5.4
v0.5.4 (2024-01-22)
Fix
- fix: ScaLA now correctly wraps models to allow for task argument to be passed (3b07a4d)
Unknown
- Merge branch 'main' of https://github.com/KennethEnevoldsen/scandinavian-embedding-benchmark (07efe8f)