Releases: KennethEnevoldsen/scandinavian-embedding-benchmark

v0.9.2

26 Jan 16:21

v0.9.2 (2024-01-26)

Ci

  • ci: Updated lint workflow to actually fail when not linted (d1c177c)

Fix

  • fix: Added relevant type ignores (27b8dd3)

Unknown

  • Merge pull request #104 from KennethEnevoldsen/ci-lint

Ci lint (ade460b)

  • fixing linting error (1b75a4f)

  • introduce intentional lint error (d861935)

v0.9.1

26 Jan 15:02

v0.9.1 (2024-01-26)

Fix

  • fix: ran swednsts and reduced dataset size (6c0a030)

  • fix: ensure that metrics is correctly formatted from MTEB (b5873b8)

Unknown

  • Merge pull request #100 from KennethEnevoldsen/sts_vs_retrieval

Reduce size of SwednSTS (c2c32b7)

  • Added integration test for four model types (f427a6d)

  • Merge pull request #92 from KennethEnevoldsen/custom_embeddings

Custom embeddings for E5 and Cohere + Interface changes to accommodate this (aeb32ce)

  • Reset test cases incorrectly overwritten by a merge conflict resolution (ccdd886)

  • Merge branch 'custom_embeddings' of https://github.com/KennethEnevoldsen/scandinavian-embedding-benchmark into custom_embeddings (2414337)

  • Put resetting the model's encode() method to a finally clause (34a2612)

  • Merge branch 'main' into custom_embeddings (18cd858)

  • Removed debugging print statements from E5 (cc12070)

  • Made EmbeddingModel into a dataclass instead of BaseModel (75debec)

  • Removed reference to MTEBTask from ScaLA (674a005)

  • Replaced MTEBTaskModel with partial() (79708bf)

  • Added encode_queries and encode_documents to EmbeddingModel, made task optional (ecee037)

  • Merge branch 'main' into stuff_runs_tests (01e8979)

  • Moved models to @parametrize (fafcb39)
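
The entries above that mention replacing a wrapper model with partial() and resetting encode() in a finally clause describe a common pattern. A minimal sketch of that pattern follows, assuming a model whose encode() accepts an optional task argument. DummyModel and run_with_task are hypothetical names used for illustration only and are not taken from the project's code.

```python
from functools import partial


class DummyModel:
    """Hypothetical stand-in for an embedding model; not the project's class."""

    def encode(self, sentences, task=None):
        # Return (sentence, task) pairs so the bound task is visible.
        return [(s, task) for s in sentences]


def run_with_task(model, task, sentences):
    """Bind `task` to model.encode for one call, then always restore it."""
    original_encode = model.encode
    # partial() pre-fills the task keyword instead of using a wrapper class.
    model.encode = partial(original_encode, task=task)
    try:
        return model.encode(sentences)
    finally:
        # Runs even if encode raises, so the model is never left patched.
        model.encode = original_encode


model = DummyModel()
patched_result = run_with_task(model, "retrieval", ["hej"])
restored_result = model.encode(["hej"])
```

Restoring in the finally clause means a failing task cannot leave the patched encode() in place for whatever task runs next.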

v0.9.0

26 Jan 08:49

v0.9.0 (2024-01-26)

Feature

  • feat: Added performance metrics for danfever (22eb72b)

Unknown

  • Merge pull request #97 from KennethEnevoldsen/add-danfever

Add danFEVER (801753f)

v0.8.0

25 Jan 15:31

v0.8.0 (2024-01-25)

Ci

  • ci: fix misspecified yaml syntax (ca5567c)

Documentation

  • docs: formatting code blocks (cee41f3)

  • docs: update docs to not run all models (90cef3d)

Feature

  • feat: Added VG clustering dataset (49e75d5)

  • feat: Add swedn clustering (0786ec5)

Fix

  • fix: fixed error arising from merge (11e28d6)

  • fix: updated based on static type checks (4752f07)

  • fix: move description to the end so as to make printing of task object prettier (f8ec70d)

  • fix: reduced size of SwednClustering and ensure that clusters match with document size (0b70730)

Style

Test

  • test: Performance using 5x2048 examples is 8.13 (ed5cb5d)

  • test: Performance using 5x10000 examples is 13.80 (ed36b82)

  • test: Performance using 2x10000 examples is 8.70 (6fe30b7)

  • test: Performance using 10000 examples is 8.46 (630769c)

  • test: Performance using 1000 examples is 8.12 (7732c32)

  • test: Performance using 100 examples is 21.07 (82f7b3f)

Unknown

  • Merge pull request #96 from KennethEnevoldsen/add-swedn-clustering

Add Swedn and VG clustering datasets (8537e12)

  • Moved task types to task interface and deleted types module (7c3b582)

  • Added English to Language type (221bdd8)

  • Removed faulty import in E5 models (601002c)

  • Merge pull request #91 from KennethEnevoldsen/new_models

Added Jina base (95c515e)

  • Fixed import error in speed task (cfccbdf)

  • Added Jina base (6d1ec69)

  • Moved task types to task interface and deleted types module (2f1adf1)

v0.7.1

23 Jan 11:06

v0.7.1 (2024-01-23)

Fix

  • fix: added task argument to TranslateE5 encoding (71dcd09)

Unknown

  • Merge pull request #87 from KennethEnevoldsen/new_models

Added XLM-Roberta large and LaBSE (74fcf43)

  • Merge pull request #85 from x-tabdeveloping/main

Added FastText and Translate-E5 models (2d9043e)

  • Removed commented-out lines (373b937)

  • Removed duplicate model (476d679)

  • Fixed duplicate model names (7275686)

  • Added XLM-Roberta large and LaBSE (858db1b)

  • Merging upstream into the branch so that it contains the fixed E5 models that pass along the task. (f6f71db)

  • TranslateE5 now uses E5Wrapper to ensure task-correct embeddings and prefixes. (5eff1fc)

  • Translate now returns a single string instead of a list (e1cefd9)

v0.7.0

23 Jan 07:30

v0.7.0 (2024-01-23)

Feature

  • feat: Added SwednRetrieval task

The idea is that it can be compared with SwednSTS to see which one makes the most sense. (7fe3371)

Unknown

  • Merge pull request #82 from KennethEnevoldsen/add-retrieval-swedn

feat: Added SwednRetrieval task (d5f959d)

v0.6.0

22 Jan 10:56

v0.6.0 (2024-01-22)

Fix

  • fix: Allow models to batch inputs (09c3527)

Unknown

  • Merge pull request #70 from KennethEnevoldsen/add-speed-task

Added speed task (d192e44)

v0.5.5

22 Jan 10:49

v0.5.5 (2024-01-22)

Fix

  • fix: Add toggle for verbosity on the cli and remove duplicate entries in table (4d26fce)

Unknown

  • Merge pull request #74 from KennethEnevoldsen/verbosity_for_cli

Fix verbosity toggle on CLI and remove duplicate entries in table (99ef0f2)

  • Remove model results for repo (2435011)

v0.5.4

22 Jan 09:51
Compare
Choose a tag to compare

v0.5.4 (2024-01-22)

Fix

  • fix: ScaLA now correctly wraps models to allow for task argument to be passed (3b07a4d)

Unknown

v0.5.3

22 Jan 09:25

v0.5.3 (2024-01-22)

Fix

  • fix: ScaLA now correctly wraps models to allow for task argument to be passed. Renamed scala cache (a70c950)

Unknown

  • Merge pull request #73 from KennethEnevoldsen/bug-scala-missing-task-encode-wrapper

Wraps ScaLA models in MTEBTaskModel (e2eee05)