Releases: iterative/datachain
0.3.18
What's Changed
- Remove obsolete UDF code by @rlamy in #452
- added embeddings/gen example by @tibor-mach in #362
- update pytest-servers to 0.5.7 by @mattseddon in #454
- Introduce telemetry in datachain by @amritghimire in #411
- Replace UniqueId with File by @rlamy in #450
- Auto load json cols by @dberenbaum in #444
New Contributors
- @tibor-mach made their first contribution in #362
Full Changelog: 0.3.17...0.3.18
0.3.17
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #451
- remove legacy udf decorator by @mattseddon in #438
- Remove storage from dataset query and refactor related codebase by @ilongin in #367
Full Changelog: 0.3.16...0.3.17
0.3.16
What's Changed
- Move 'join' SQL implementation to warehouse by @dreadatour in #409
Full Changelog: 0.3.15...0.3.16
0.3.15
What's Changed
- Add resolve files by @EdwardLi-coder in #313
- unskip test_udf_parallel by @mattseddon in #432
- fix last modified comparison in resolve file test by @mattseddon in #436
- Refactor Client.parse_url() by @ilongin in #435
- Set stream for nested file signals by @dberenbaum in #443
- Read arrow files from cache by @dberenbaum in #442
- Auto-detect huggingface datasets when reading tabular data by @dberenbaum in #398
- Add datachain.lib.tar.process_tar() generator by @rlamy in #440
- Fix storage dependencies by @ilongin in #421
Full Changelog: 0.3.14...0.3.15
0.3.14
What's Changed
- fix dependency install instructions for examples by @mattseddon in #426
- Show progress bar for pytorch conversion by @dberenbaum in #429
- Fix calculating datasets stats size by @dreadatour in #418
- use the correct fixtures in tests by @mattseddon in #428
- Adding Complex Type Support to Signal Schema by @dtulga in #422
- tests: fix mock for subprocess stdout/stderr to return BytesIO by @skshetry in #431
- prevent tests from hanging on CI (windows) by @mattseddon in #427
- Remove Entry class and use File instead by @rlamy in #419
Full Changelog: 0.3.13...0.3.14
0.3.13
0.3.12
What's Changed
- Fixes settings by @dberenbaum in #397
- fix open file method for tar files by @dberenbaum in #412
- disable execution of last query expression by default by @skshetry in #407
New Contributors
- @yathomasi made their first contribution in #408
Full Changelog: 0.3.11...0.3.12
0.3.11
What's Changed
- query: remove use of pipe for communication by @skshetry in #393
- do not require last statement to be an expression or an instance of DatasetQuery by @skshetry in #395
- pin pydantic < 2.9 by @mattseddon in #399
- unpin pydantic, use python API for datamodel_codegen by @skshetry in #400
- Update the DataChain logo in the README and docs by @djsauble in #402
- avoid splitting script into feature files/scripts by @skshetry in #385
- allow merge on expressions by @mattseddon in #388
New Contributors
Full Changelog: 0.3.10...0.3.11
0.3.10
What's Changed
- Support for reading from huggingface hub with hf:// filesystem by @dberenbaum in #375
- Simplify datachain.lib.listing by reusing Client.scandir() by @rlamy in #376
- Use stderr for sql debug prints by @shcheklein in #378
- Refactor DataChain.from_storage() to use new listing generator by @ilongin in #294
- remove unused finally block by @mattseddon in #379
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #382
- increase timeout of e2e test by @mattseddon in #383
- metrics: save metrics in realtime by @skshetry in #387
- query: remove support for saving dataset query with a given name by @skshetry in #389
- Using job class instead of hardcoded Job by @ilongin in #391
- cli: remove preview from datachain query command by @skshetry in #392
- fix issues with new version of huggingface datasets package by @mattseddon in #394
- Add DataChain.listings() method and use it in getting storages by @ilongin in #331
Full Changelog: 0.3.9...0.3.10
0.3.9
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #354
- increase timeout of datachain tests in CI (Windows) by @mattseddon in #363
- remove LaionMeta model store registration from wds example by @mattseddon in #364
- slight positioning change to deny AI abstractions by @volkfox in #356
- unstructured example - remove misleading install instructions by @mattseddon in #366
- improve datachain subtract by @EdwardLi-coder in #352
- Fixing get_file_signals for custom types by @dtulga in #371
Full Changelog: 0.3.8...0.3.9