16 Nov 04:23

changhiskhan

a55f929

v0.2.3 Bugfix release; breaks dataset proto schema

What's Changed

[C++] Project schema via field Ids and Schema intersection by @eddyxu in #305
when writing in batches, handle all na arrays properly by @changhiskhan in #306
[C++] Use LanceFragment to build I/O exec plan by @eddyxu in #307
[CI] Fix Github Action warning to upgrade nodejs 12 based actions by @eddyxu in #309
Update README.md by @changhiskhan in #310
Temporarily pin duckdb to 0.5.1 by @changhiskhan in #313
Notebook for new blog post on versioning by @changhiskhan in #311
[C++] Fix reading dictionary values from manifest files by @eddyxu in #314

Full Changelog: v0.2.2...v0.2.3

Contributors

eddyxu and changhiskhan

09 Nov 17:25

eddyxu

v0.2.2

8a9d736

v0.2.2 Python notebooks and CV dataset conversion.

What's Changed

[DOC] Update README.md by @jaichopra in #294
[DUCKDB] Script to upload lance extension zip by @changhiskhan in #295
[C++] Scan Node reads multiple files by @eddyxu in #300
[Python] Add lance.util.duckdb to help install the extension transparently by @changhiskhan in #301
[Python] Notebook fixes by @changhiskhan in #303
[Python] Make dataset conversion a feature by @changhiskhan in #304

Full Changelog: v0.2.1...v0.2.2

Contributors

eddyxu, changhiskhan, and jaichopra

04 Nov 22:41

changhiskhan

v0.2.1

70d72fb

v0.2.1 Bug fix release

Fixed a bug affecting writes of fixed-size list arrays, as well as the datagen code for COCO.
Updated to the newly released Arrow 10.0.

What's Changed

remove duplicate test_mac.sh by @changhiskhan in #284
Fix build on intel mac by @eddyxu in #286
[C++] Fix write fixed list array bug by @eddyxu in #288
Upgrade Apache Arrow to 10.0 by @eddyxu in #266
temporary hack to fix pytorch loader until it can handle a versioned … by @changhiskhan in #293
fix image_id alignment in coco datagen by @changhiskhan in #289

Full Changelog: v0.2.0...v0.2.1

Contributors

eddyxu and changhiskhan

02 Nov 19:38

eddyxu

v0.2.0

f4eaa21

v0.2.0 Dataset Versioning, DuckDB extension built with CUDA

Highlights

Lance dataset versioning support
DuckDB extension supports building against PyTorch with CUDA
Revamped README and documentation

What's Changed

Fetch Dataset Versions by @eddyxu in #272
Readability improvement for metadata class by @Renkai in #275
[DuckDB] Enable DuckDB extension to build/run on CUDA-enabled PyTorch by @changhiskhan in #273
[Python] Support multi-versioned dataset by @eddyxu in #278
[Document] Add logo/README refresh by @jaichopra in #279
[Python] Fetch dataset versions. by @eddyxu in #280
[Python] Support group size and rows_per_file customization via write_dataset API by @eddyxu in #281
[Python] use new write API in python benchmark by @eddyxu in #282

Full Changelog: v0.1.5...v0.2.0

Contributors

eddyxu, changhiskhan, and 2 other contributors

28 Oct 16:52

eddyxu

v0.1.5

72e47ef

v0.1.5 Pandas Extension Type, Jupyter Notebook and Document Improvements

What's Changed

Add model inference notebook by @changhiskhan in #244
update README.md to simplify comms, prior to blog post by @jaichopra in #248
Jaichopra/rebrand lance by @jaichopra in #249
Exclude jupyter notebook from github language stats by @changhiskhan in #251
linguist fix by @changhiskhan in #253
restore skipped test since extension types are working on mac by @changhiskhan in #256
ingestion example by @changhiskhan in #252
Update README.md by @jaichopra in #261
[CI] pin arrow 9.0 in GHA by @eddyxu in #268
Update README.md by @jaichopra in #264
Pandas extension dtype for image by @changhiskhan in #267
Reorganize tutorial notebooks by @changhiskhan in #265
Merge two Schemas by @eddyxu in #263
Versioning support with Appending Dataset by @eddyxu in #262
Change datagen to use public https image urls by @changhiskhan in #271

Full Changelog: v0.1.4...v0.1.5

Contributors

eddyxu, changhiskhan, and jaichopra

16 Oct 17:18

eddyxu

v0.1.4

d7dfad6

v0.1.4: Var-length binary decoder performance improvements, Open Discord server for community.

What's Changed

CLI to inspect lance dataset by @eddyxu in #231
Generate primary key for Oxford Pet dataset by @eddyxu in #233
Fix datagen test by @eddyxu in #234
Add discord link and fix typo in README by @eddyxu in #236
Improve VarBinaryDecoder::Take performance by accumulating small batches by @eddyxu in #239

Full Changelog: v0.1.3...v0.1.4

Contributors

eddyxu

09 Oct 02:34

eddyxu

v0.1.3

41bdf4c

Document improvements and bug fixes

What's Changed

EDA Howtos by @eddyxu in #186
Apply Limit across multiple files in the dataset. by @eddyxu in #226
Fix false assertion during BinaryEncoding by @eddyxu in #227

Full Changelog: v0.1.2...v0.1.3

Contributors

eddyxu

04 Oct 18:27

changhiskhan

v0.1.2

24a7988

v0.1.2

Lance now supports projection for nested columns (e.g., "annotations.name").
There is also a fast path for CountRows that gets the record count from metadata.
Finally, Lance now supports writing optional key-value metadata (pa.Table.schema.metadata).
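
As a rough illustration of what projecting a nested column such as "annotations.name" means, here is a minimal pure-Python sketch. It is not Lance's API or implementation; `project_nested` and the sample `row` are hypothetical names used only for this example.

```python
from typing import Any


def project_nested(record: dict, path: str) -> Any:
    """Follow a dotted path into a nested record (hypothetical helper)."""
    value: Any = record
    for key in path.split("."):
        if isinstance(value, list):
            # A list of structs projects to a list of the selected field.
            value = [item[key] for item in value]
        else:
            value = value[key]
    return value


# Hypothetical sample row with a nested "annotations" column.
row = {
    "image_id": 1,
    "annotations": [
        {"name": "cat", "score": 0.9},
        {"name": "dog", "score": 0.8},
    ],
}

print(project_nested(row, "annotations.name"))  # ['cat', 'dog']
```

Only the selected sub-field is returned; the sibling field `score` is never touched, which is the point of projecting a nested column rather than reading the whole struct.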

What's Changed

GH release automation improvement by @changhiskhan in #212
Refactor benchmark and dataset generation by @changhiskhan in #205
install fsspec and s3fs by @changhiskhan in #213
Add ArrayFromJSON and TableFromJSON help functions for easy testing. by @eddyxu in #214
[C++] Merge two schemas by @eddyxu in #217
Support nested column projection by @eddyxu in #220
Implement CountRows for fast path of countings by @eddyxu in #221
Write schema metadata to Manifest by @eddyxu in #222

Full Changelog: v0.1.1...v0.1.2

Contributors

eddyxu and changhiskhan

29 Sep 05:13

changhiskhan

v0.1.1

d53ea14

v0.1.1

Fix up the Mac wheel to enable extension types on macOS

What's Changed

run quickstart notebook on every commit by @changhiskhan in #209
update README.md to reflect new URI and bindings by @jaichopra in #211
[Python] Exclude relocating packages in homebrew by @eddyxu in #210

New Contributors

@jaichopra made their first contribution in #211

Full Changelog: v0.1.0...v0.1.1

Contributors

eddyxu, changhiskhan, and jaichopra

27 Sep 23:07

changhiskhan

v0.1.0

c70a683

v0.1.0

Highlights

Documentation is now live and a Quickstart Notebook is available
Lance is now integrated with PyTorch and supports multiple workers.
Vision-specific extension types: Box2d provides vectorized IoU, and Image types make it easy to perform I/O and convert between bytes, PIL, NumPy, and tensors.
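
For readers unfamiliar with the IoU (intersection over union) metric mentioned above, the following is a minimal scalar sketch in plain Python. It assumes boxes in [x0, y0, x1, y1] form and is not the Box2d implementation, which is vectorized; the function name `iou` is chosen here only for illustration.

```python
from typing import Sequence


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two boxes given as [x0, y0, x1, y1]."""
    # Corners of the overlapping rectangle (may be empty).
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Degenerate boxes have zero union; report no overlap in that case.
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: 1 / (4 + 4 - 1).
print(iou([0, 0, 2, 2], [1, 1, 3, 3]))
```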

What's Changed

Setting BatchSize via ScanBuilder by @eddyxu in #135
Move Expression based schema project to Schema class by @eddyxu in #137
Refactor I/O exec nodes by @eddyxu in #136
Simplify RecordBatchReader to use Project.next() by @eddyxu in #139
Convert bdd100k dataset in python benchmarks by @eddyxu in #131
Fix the condition of Scan advancing batch id by @eddyxu in #143
Initial PyTorch Dataset support by @eddyxu in #134
Example training code over oxford pet dataset by @eddyxu in #144
Test writing fixed size list and fixed size binary via WriteTable by @eddyxu in #151
Fix fixed size length calculation by @eddyxu in #152
Provide binary to profiling scans by @eddyxu in #149
Multi-worker support in Pytorch Dataset by @eddyxu in #147
Vision specific extension types by @changhiskhan in #146
lance dataset that overrides Dataset.scanner and Dataset.head by @changhiskhan in #158
Pickle Image by @changhiskhan in #160
Only load manifest once within the dataset and share Manifest among the readers by @eddyxu in #155
Improve ergonomic of the Pytorch dataset and Generate embeddings for oxford pet by @eddyxu in #157
Fix PlainEncoder to read empty page by @eddyxu in #164
Convert coco annotations from the list of structs to struct of lists by @eddyxu in #166
Convert coco bounding box format to [x0,y0,x1,y1] format. by @eddyxu in #169
Image Array by @changhiskhan in #168
Fix writing and reading extension type by @eddyxu in #172
Coco improvements by @changhiskhan in #174
Support partitioning and group size control in coco dataset generation. by @eddyxu in #175
Extension type improvements to support 3d types by @changhiskhan in #173
Support converting PIL from Image in pytorch Dataset by @eddyxu in #176
Minor fix for 3d extension types by @changhiskhan in #177
MS coco dataset training by @eddyxu in #163
Change version import to relative import by @eddyxu in #181
[python] Mix of minor improvements by @changhiskhan in #182
Automatically build document and publish to Github Pages by @eddyxu in #180
[benchmarks] simplify the datagen code and remove partitioning for now by @changhiskhan in #183
Fix PlainDecoder handle empty filtered array by @eddyxu in #187
[python] minor improvements by @changhiskhan in #190
Fix bug that attempt to partitioned columns which does not exist in the file. by @eddyxu in #189
Pass filter indices via Limit and Return empty array in GetListArray by @eddyxu in #191
Exclude filter columns from projection by @eddyxu in #194
action to bump version for new release by @changhiskhan in #199
[C++] [BUG] Adjust offset when the batch size is set for reading by @eddyxu in #201
GH action to upload wheels and also make reusable yml by @changhiskhan in #200
[Python] Test projection in Python Torch Dataset by @eddyxu in #202
Fix typo of calculating offsets for slicing index by @eddyxu in #206
Changhiskhan/tutorial by @changhiskhan in #167
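
Two of the COCO changes listed above (#166 and #169) are data-layout conversions. The sketch below shows the general idea in plain Python, assuming the standard COCO bounding box layout of [x, y, width, height]. It is not the code from those pull requests; `xywh_to_xyxy`, `to_struct_of_lists`, and the sample annotations are hypothetical.

```python
def xywh_to_xyxy(box):
    """Convert a COCO-style [x, y, width, height] box to [x0, y0, x1, y1]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def to_struct_of_lists(annotations):
    """Turn a list of structs (dicts) into one struct of parallel lists."""
    if not annotations:
        return {}
    # Assumes every annotation has the same keys as the first one.
    return {key: [ann[key] for ann in annotations] for key in annotations[0]}


# Hypothetical annotations for one image.
annotations = [
    {"category_id": 18, "bbox": [10, 20, 30, 40]},
    {"category_id": 44, "bbox": [0, 0, 5, 5]},
]

columns = to_struct_of_lists(annotations)
columns["bbox"] = [xywh_to_xyxy(b) for b in columns["bbox"]]
print(columns)
```

The struct-of-lists layout keeps each field in its own contiguous list, which is generally a better fit for columnar storage than one struct per annotation.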

Full Changelog: v0.0.5...v0.1.0

Contributors

eddyxu and changhiskhan

Releases: lancedb/lance