
Releases: lancedb/lance

v0.4.5 Preview private API for merging columns

04 May 18:28

Welcome @Mause as our newest contributor! Also, a big thank you for your work on the DuckDB extension framework.

This release adds a preview of distributed column additions. It is now possible to distribute Lance Fragments across nodes, add a new column to each Fragment, and then write a new Lance dataset version manifest with the updated schema and files.
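
The workflow can be modeled roughly as follows. This is a minimal conceptual sketch in plain Python, not the preview API itself: the `Fragment` and `Manifest` classes and the `add_column` and `commit` helpers are hypothetical stand-ins invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical stand-in for a Lance fragment: column name -> values."""
    fragment_id: int
    columns: dict


@dataclass
class Manifest:
    """Hypothetical stand-in for a dataset version manifest."""
    version: int
    schema: list
    fragments: list


def add_column(fragment, name, fn):
    """Worker-side step: compute a new column from one fragment's rows.

    Returns a new Fragment and leaves the input untouched, so the previous
    dataset version stays readable.
    """
    num_rows = len(next(iter(fragment.columns.values())))
    rows = [{k: v[i] for k, v in fragment.columns.items()} for i in range(num_rows)]
    new_columns = dict(fragment.columns)
    new_columns[name] = [fn(row) for row in rows]
    return Fragment(fragment.fragment_id, new_columns)


def commit(old, new_fragments, new_column):
    """Coordinator-side step: produce a new manifest with the updated schema."""
    return Manifest(old.version + 1, old.schema + [new_column], new_fragments)


v1 = Manifest(1, ["x"], [Fragment(0, {"x": [1, 2]}), Fragment(1, {"x": [3]})])
# In a real deployment each add_column call could run on a different node.
updated = [add_column(f, "x_squared", lambda row: row["x"] ** 2) for f in v1.fragments]
v2 = commit(v1, updated, "x_squared")
```

The point of the split is that only the final `commit` step needs a single coordinator; the per-fragment work is independent.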

What's Changed

New Contributors

Full Changelog: v0.4.4...v0.4.5

v0.4.4 Various bug fixes

25 Apr 20:53

#805 fixed an integer overflow bug in the plain decoder that resulted in high latency for Take (and consequently high latency for vector search). We'll be adding continuous performance benchmarks soon to prevent issues like this from being released in the future.

We also fixed a gap in cosine similarity where the vectors do not line up perfectly with SIMD strides on the platform.
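
To illustrate the kind of case involved, here is a minimal pure-Python sketch of a strided cosine similarity with an explicit tail loop. It is not Lance's Rust implementation; the `LANES` width of 8 is an assumed value, and the sketch assumes non-zero vectors.

```python
import math

LANES = 8  # assumed SIMD width in floats; the real value depends on the platform


def cosine_similarity(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = norm_a = norm_b = 0.0
    full = len(a) - len(a) % LANES
    # Main loop: whole strides, the part a SIMD kernel would vectorize.
    for start in range(0, full, LANES):
        for i in range(start, start + LANES):
            dot += a[i] * b[i]
            norm_a += a[i] * a[i]
            norm_b += b[i] * b[i]
    # Tail loop: leftover elements when the length is not a multiple of LANES.
    for i in range(full, len(a)):
        dot += a[i] * b[i]
        norm_a += a[i] * a[i]
        norm_b += b[i] * b[i]
    return dot / math.sqrt(norm_a * norm_b)


# Length 10 leaves a tail of 2 elements after one full stride of 8.
a = [1.0] + [0.0] * 8 + [1.0]
b = [0.0] * 9 + [1.0]
similarity = cosine_similarity(a, b)
```

Here the only overlap between `a` and `b` sits in the tail, so a kernel that skipped the tail loop would report a wrong similarity.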

DiskANN progress is continuing. The first milestone will be an in-memory version to support smaller datasets. A compressed, disk-based version will follow soon after that.

What's Changed

Full Changelog: v0.4.3...v0.4.4

v0.4.3 Bug fixes and code cleanup

20 Apr 06:16

What's Changed

Full Changelog: v0.4.2...v0.4.3

v0.4.2 Polars, GCS, and distributed lances

14 Apr 17:57

A warm welcome to @hzhang86 as Lance's newest contributor. Thanks for adding TPCH benchmarks for Lance to establish a baseline. This really helps us focus our performance optimization roadmap.

This release is packed with valuable features:

  1. Added direct Polars scans that do not need to pull everything into memory.
  2. We now expose FileFragments so that distributed processing engines like Spark can easily access parts of a Lance dataset.
  3. Last but not least, we've added support for reading Lance data directly from Google Cloud Storage (GCS) buckets.

What's Changed

New Contributors

Full Changelog: v0.4.1...v0.4.2

v0.4.1 Support Append in Vector Search

05 Apr 21:30

Vector search in Lance now supports live updates. Previously, adding new vectors to the dataset required rebuilding the index. Now the index is "inherited", and the vector search results combine an ANN search on the indexed data with a KNN search on the newly appended data. The result is a small latency increase, and recall should be the same or better.

This provides a smooth performance curve until you have inserted enough new data that re-indexing is warranted.
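
The combination of the two searches can be sketched as below. This is a toy model in plain Python, not Lance's implementation: `search`, `toy_index`, and the squared-L2 metric are assumptions chosen for illustration, and the "index" here is exact rather than approximate.

```python
import heapq


def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(query, k, ann_search, appended):
    """Merge index results with a brute-force scan of unindexed rows.

    ann_search: callable (query, k) -> list of (distance, row_id); stands in
                for the existing vector index.
    appended:   dict of row_id -> vector for rows added after indexing.
    """
    indexed_hits = ann_search(query, k)
    appended_hits = [(l2(query, vec), row_id) for row_id, vec in appended.items()]
    return heapq.nsmallest(k, indexed_hits + appended_hits)


indexed = {0: [0.0, 0.0], 1: [5.0, 5.0], 2: [9.0, 9.0]}


def toy_index(query, k):
    # Exact search standing in for an approximate index over `indexed`.
    return heapq.nsmallest(k, [(l2(query, v), rid) for rid, v in indexed.items()])


appended = {3: [1.0, 1.0]}  # row added after the index was built
results = search([1.0, 1.0], 2, toy_index, appended)
```

The brute-force part scans every appended row, which is why latency grows with the amount of unindexed data until a re-index becomes worthwhile.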

What's Changed

Full Changelog: v0.4.0...v0.4.1

v0.4.0 Windows support

30 Mar 22:22

A warm welcome to @gsajko! Thanks for making our tutorial notebook easier to use and understand!

Note: OPQ is disabled on Windows for the vector index. This will be addressed once LAPACK support is added.

What's Changed

New Contributors

Full Changelog: v0.3.19...v0.4.0

v0.3.19 Bug fix for filter predicates on large-utf8 type

27 Mar 17:58

This release also fixes publishing to crates.io.

What's Changed

  • Make contract clear for KNN nodes by @eddyxu in #729
  • Refactor Scan I/O plan by @eddyxu in #731
  • [Rust] Use folked sqlparser to unblock rust crate release by @eddyxu in #732
  • [Rust] Fix filter on large UTF8 columns by @eddyxu in #733

Full Changelog: v0.3.18...v0.3.19

v0.3.18 Bug fix release for binary offsets

24 Mar 07:45

Fixes incorrect offsets for string/variable-length list columns, as reported in #720 (comment).

Thanks @lucazanna for the feedback!

What's Changed

Full Changelog: v0.3.17...v0.3.18

v0.3.17 Support for nested dict columns

22 Mar 02:05

A warm welcome to @haoxins, a new contributor who has helped improve Lance documentation.

This release adds support for list-of-dict columns (thanks @lucazanna for reporting the bug in #715).

Also included in this release are various vector index improvements for scalability and more progress towards OPQ implementation.

What's Changed

New Contributors

Full Changelog: v0.3.16...v0.3.17

v0.3.16 Filter pushdown improvements

18 Mar 06:48

Welcome @wangfenjin to the Lance contributors. Thanks for submitting a bug fix for the Lance DuckDB extensions 🔥

This release contains two workarounds for Arrow limitations:

  1. Lance datasets now accept filters such as <field> LIKE '%' and <field> IN (<values>) passed in as strings. Generic SQL syntax supported by DataFusion is now accepted. This is a break from standard pyarrow Dataset behavior, which only accepts Arrow compute Expressions. Those are not available in Rust, and in Python they do not support the introspection developers would need to build a custom adapter.

  2. When concatenating Arrow dictionary arrays, the dict values are duplicated. There are currently no concrete plans to change this behavior in Arrow. Instead, we fix that at write time in Lance.
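
The second workaround can be pictured with a small model of dictionary-encoded arrays. This is a hedged sketch in plain Python, not Arrow's or Lance's actual code: each chunk is an assumed `(indices, values)` pair, and `concat_naive` and `concat_unified` are hypothetical helpers.

```python
def concat_naive(chunks):
    """Concatenate (indices, values) chunks by appending dictionaries as-is."""
    out_indices, out_values = [], []
    for indices, values in chunks:
        offset = len(out_values)
        out_indices.extend(i + offset for i in indices)
        out_values.extend(values)
    return out_indices, out_values


def concat_unified(chunks):
    """Concatenate chunks while keeping each dictionary value only once."""
    out_indices, out_values, position = [], [], {}
    for indices, values in chunks:
        # Map each chunk-local index to its slot in the shared dictionary.
        remap = []
        for value in values:
            if value not in position:
                position[value] = len(out_values)
                out_values.append(value)
            remap.append(position[value])
        out_indices.extend(remap[i] for i in indices)
    return out_indices, out_values


def decode(indices, values):
    """Expand a dictionary-encoded array back into plain values."""
    return [values[i] for i in indices]


chunks = [([0, 1, 0], ["cat", "dog"]), ([1, 0], ["dog", "cat"])]
naive = concat_naive(chunks)
unified = concat_unified(chunks)
```

Both results decode to the same rows, but the naive dictionary holds four values while the unified one holds two.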

What's Changed

New Contributors

Full Changelog: v0.3.15...v0.3.16