Releases: iterative/datachain
0.8.3
What's Changed
- Allow empty session name by @0x2b3bfa0 in #757
- fix(from_storage): no listing / cache on a single file by @shcheklein in #734
- fix(cache, client): support versioned files by @shcheklein in #753
- Fix signed url versioning by @shcheklein in #755
Full Changelog: 0.8.2...0.8.3
0.8.2
What's Changed
- quick-start: use anon for GCP by @dmpetrov in #751
- add code to readme by @dmpetrov in #750
- Return url for file in GCP public bucket instead of error by @dreadatour in #754
Full Changelog: 0.8.1...0.8.2
0.8.1
What's Changed
- fix(listing): pick exact or create new one on update by @shcheklein in #726
- fix(session): keep cached successful listings by @shcheklein in #728
- Remove duplication in 'FileType' type definition by @dreadatour in #731
- fix(dc): move parse uri and rename out of main class by @shcheklein in #732
- Fix Mistral example in quick-start by @dmpetrov in #735
- build(deps): bump mypy from 1.13.0 to 1.14.0 by @dependabot in #737
- build(deps): bump astral-sh/setup-uv from 4 to 5 by @dependabot in #736
- build(deps): bump ultralytics from 8.3.50 to 8.3.53 by @dependabot in #738
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #733
- quick-start: A small note about transformers install by @dmpetrov in #741
- from_parquet: use virtual filesystem to preserve partition information when using cache by @skshetry in #745
- Return url for file in GCP public bucket instead of error by @dreadatour in #748
- Optimize UDF with parallel execution by @dreadatour in #713
Full Changelog: 0.8.0...0.8.1
0.8.0
What's Changed
- Adding `DataChain.diff(...)` by @ilongin in #666
- Implement cancel job operation for CLI by @amritghimire in #695
- Making `cp` as non default when pulling dataset by @ilongin in #720
- Add log streaming to the created job by @amritghimire in #680
- Refactor pulling dataset rows by @ilongin in #617
- Added `DataChain.diff()` by @ilongin in #718
- cleanup a few from_json methods from DC by @shcheklein in #727
Full Changelog: 0.7.11...0.8.0
0.7.11
What's Changed
- Allow adding new signal from existing signal field by @dreadatour in #658
- Implement 'seed' for 'train_test_split' (take two) by @dreadatour in #678
- cli: improve startup time by @skshetry in #693
- Fix a typo in `docs/examples.md` by @LeviLovie in #675
New Contributors
- @LeviLovie made their first contribution in #675
Full Changelog: 0.7.10...0.7.11
0.7.10
What's Changed
- add some missing type hint by @shibuiwilliam in #648
- update readme - lower use cases to the ground by @shcheklein in #668
- Add 'bit_hamming_distance' and 'byte_hamming_distance' SQLite functions by @dreadatour in #669
- feat: update documentation and enhance navigation structure by @yathomasi in #670
- Allow testing external contributions using secrets by @0x2b3bfa0 in #667
- Fix readme by @dreadatour in #674
New Contributors
- @shibuiwilliam made their first contribution in #648
Full Changelog: 0.7.9...0.7.10
0.7.9
What's Changed
- Migrate the datasets and job urls by @amritghimire in #652
- pytorch dataset: pass cache/prefetch to DataChain instances by @skshetry in #653
- Use 'setup' in HuggingFace llm example by @dreadatour in #662
- Bump ultralytics from 8.3.29 to 8.3.37 by @dependabot in #626
- `to_pytorch`: enable prefetching by @skshetry in #664
- order datasets and versions in list_datasets query by @mattseddon in #665
Full Changelog: 0.7.8...0.7.9
0.7.8
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #654
- Update base Func class and tests by @dreadatour in #641
Full Changelog: 0.7.7...0.7.8
0.7.7
What's Changed
- Remove 'partition_by' requirement from 'group_by' method by @dreadatour in #649
- Support 'case' operator inside functions by @dreadatour in #651
- Add test for 'group_by' without 'partition_by' by @dreadatour in #650
Full Changelog: 0.7.6...0.7.7
0.7.6
What's Changed
- Using version `uuid` in `datachain pull` by @ilongin in #621
- Add option to run the datachain query to Studio by @amritghimire in #579
- Refactor the cli commands for the datasets by @amritghimire in #622
- remove catalog.get_file_from_row by @mattseddon in #645
- Removing id generator by @ilongin in #594
Full Changelog: 0.7.5...0.7.6