Releases: apache/iceberg-python
pyiceberg-0.8.1
Full Changelog: pyiceberg-0.8.0...pyiceberg-0.8.1
Patch Release PR: #1384
What's Changed
The behavior of `Table.name` is changed to return the table name without the catalog name. This is part of a broader effort to remove references to the catalog name in pyiceberg.
- Replace usage of `Table.identifier` with `Table.name`, which returns the table name without the catalog name
- Replace the use of a deprecated function (`identifier_to_tuple_without_catalog`) in pyiceberg; remove unnecessary warnings
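To illustrate what "the table name without the catalog name" means, here is a minimal sketch in plain Python. It assumes identifiers are tuples of strings and that the catalog name, when present, is the first element. The helper `strip_catalog_name` and the example values are hypothetical and are not part of the pyiceberg API.

```python
from typing import Tuple

# Assumption: an identifier is a tuple of name parts, e.g. (catalog, namespace, table).
Identifier = Tuple[str, ...]


def strip_catalog_name(identifier: Identifier, catalog_name: str) -> Identifier:
    """Hypothetical helper: drop a leading catalog name if it is present."""
    if len(identifier) > 1 and identifier[0] == catalog_name:
        return identifier[1:]
    return identifier


# Hypothetical example values.
full_identifier = ("prod", "analytics", "events")
table_name = strip_catalog_name(full_identifier, "prod")
print(".".join(table_name))
```

Code that previously relied on the catalog name being part of the identifier may need to track the catalog name separately.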
Documentation updates are included to reflect the updated process in https://py.iceberg.apache.org/
- Update "how to release" documentation
- 0.8.0 post-release steps
Bug fixes
- Fix `add_files` for parquet files without column stats
- Allow leading underscore in column name used in row filter
- Ignore tables without table_type property from Glue and Hive
- Write `null` in manifest list metadata when there is no parent-snapshot-id
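The last fix above can be pictured with a small sketch. It assumes the manifest list metadata is a string-to-string mapping and that a missing parent is recorded as the literal string `null`. The function name and dictionary layout are hypothetical, and this is not the pyiceberg writer itself.

```python
from typing import Dict, Optional


def manifest_list_metadata(snapshot_id: int, parent_snapshot_id: Optional[int]) -> Dict[str, str]:
    """Hypothetical sketch: build string-valued metadata for a manifest list."""
    return {
        "snapshot-id": str(snapshot_id),
        # Write "null" explicitly instead of omitting the key or writing "None".
        "parent-snapshot-id": str(parent_snapshot_id) if parent_snapshot_id is not None else "null",
    }


first_snapshot = manifest_list_metadata(snapshot_id=1, parent_snapshot_id=None)
child_snapshot = manifest_list_metadata(snapshot_id=2, parent_snapshot_id=1)
print(first_snapshot["parent-snapshot-id"], child_snapshot["parent-snapshot-id"])
```

The first snapshot of a table has no parent, which is the case this fix addresses.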
Remove upper bound restrictions for dependency libraries; allow early testing of new versions
- Remove Python library version upper bound restriction; allow Python 3.13
- Remove fsspec library version upper bound restriction
Commits
36 new commits since the 0.8.0 release.
12 new commits will be included in 0.8.1
- 11 commits cherry-picked as bug fixes (listed below)
- 1 commit to bump version to `0.8.1`
11 bug fixes (cherry-picked)
acbd071 Write `null` when there is no parent-snapshot-id (#1383)
bb078cf Add instruction for patch release (#1373)
ab43c6c fix `KeyError` raised by `add_files` when parquet file does not have column stats (#1354)
cc1ab2c Improve documentation for "how to release" (#1359)
64dc6fe Remove Python 3.13 upper bound restriction (#1355)
d86ab6e Allow leading underscore in column name used in row filter (#1358)
7a4734e Replace reference of `Table.identifier` with `Table.name` (#1346)
a66ddc0 Ignore tables without `table_type` from Glue and Hive (#1332)
2cbc77d Drop upper bounds for fsspec and its implementations (#1341)
7660a5b 0.8.0 post release steps (#1334)
b2f0a9e use the non-deprecated func (#1326)
New Contributors
- @sumanth-manchala made their first contribution in #1341
- @gitzwz made their first contribution in #1332
- @vincenzon made their first contribution in #1358
- @bigluck made their first contribution in #1355
- @binayakd made their first contribution in #1354
pyiceberg-0.8.0
What's Changed
PR
- Update PyIceberg Verify Release doc by @chinmay-bhat in #976
- DOCS: Add Github Actions Screenshots to Release Notes by @sungwy in #975
- Bump up version in dev Dockerfile and Issue Template by @ndrluis in #981
- Fix pydantic warning in the commit process by @ndrluis in #972
- Bump up Iceberg version to 1.6.0 by @ndrluis in #982
- Bug Fix: use appropriate partition spec for delete by @sungwy in #984
- [Bug Fix] Use `self.table_metadata` when in transaction by @HonahX in #985
- DOCS: Add more post release notes by @sungwy in #983
- Treat warning as error in CI/Dev by @ndrluis in #973
- Use 'strtobool' instead of comparing with a string. by @ndrluis in #988
- Fix: accept empty arrays in struct field lookup by @grobgl in #997
- Add ndrluis as collaborator by @sungwy in #1009
- Fix list namespace response in rest catalog by @ndrluis in #995
- Pyarrow IO property for configuring large v small types on read by @sungwy in #986
- Update metadata-log for non-rest catalogs by @soumya-ghosh in #977
- Exclude Python 3.9.7 due to import error in catalog module by @ndrluis in #526
- Deprecate rest.authorization-url in favor of oauth2-server-uri by @ndrluis in #962
- Allow setting `write.parquet.row-group-limit` by @Fokko in #1016
- Deprecate Redundant Identifier Support in TableIdentifier, and row_filter by @sungwy in #994
- Fix: Handle Empty RecordBatch within `_task_to_record_batches`, fix correctness issue with positional deletes by @sungwy in #1026
- Fix overwrite when filtering all the data by @ndrluis in #1023
- Allow setting `write.parquet.page-row-limit` by @Fokko in #1017
- DOCS: Remove older row for `write.parquet.row-group-limit` by @sungwy in #1030
- Improve test_version_format() error message for version mismatches by @laksh-krishna-sharma in #1015
- Bump version to 0.7.1 by @sungwy in #1034
- Support s3.signer.endpoint for nessie by @guitcastro in #1029
- [bug] fix reading with `to_arrow_batch_reader` and `limit` by @kevinjqliu in #1042
- Use `VisitorWithPartner` for name-mapping by @Fokko in #1014
- Fix tracing existing entries when there are deletes by @Fokko in #1046
- Coverage Run unit tests first before docker containers are set up by @Minfante377 in #1055
- Update "verify release" instruction by @kevinjqliu in #1064
- Fix Install Issues with `docutils = 0.21.post1` and exclude 3.12 from supported python dependencies by @sungwy in #1067
- Post Release 0.7.1 version updates by @sungwy in #1073
- Update create table doc to clarify ID re-assignment by @paulcichonski in #1072
- Refactor PyArrow DataFiles Projection functions by @sungwy in #1043
- DOCS: Exclude signature files from twine upload by @sungwy in #1071
- Increase the minimal required pyarrow version to 14.0.0 by @ndrluis in #1090
- Fix `table_exists` behavior in REST catalog by @ndrluis in #1096
- fix: improve makefile by @TiansuYu in #1091
- fix (issue-1079): allow update_column to set doc as '' by @TiansuYu in #1083
- prevent adding duplicate files by @amitgilad3 in #1036
- Add list_views to rest catalog by @ndrluis in #817
- Emit warnings instead of failing when seeing unsupported configuration by @Fokko in #1111
- Use `markdownlint` instead of `mdformat` by @kevinjqliu in #1118
- Add drop_view to the rest catalog by @ndrluis in #820
- Support python 3.12 by @kevinjqliu in #1068
- Make `commit_table` public by @Fokko in #1112
- Refactoring: Break down very large `table/__init__.py` module by @sungwy in #1144
- fix: Invert `case_sensitive` logic in StructType by @AnthonyLam in #1147
- Bump `duckdb` to version `1.1.0` by @kevinjqliu in #1149
- Deprecate ADLFS prefix in favor of ADLS by @ndrluis in #961
- Cache Manifest files by @chinmay-bhat in #787
- Use the correct spec when rewriting existing manifests by @Fokko in #1157
- Bug Fix: Use historical partition field name by @sungwy in #1161
- fix: remove old, incorrect docstring by @dataders in #1166
- Preserve Backward compatibility in 0.8.0 for #1144 by @sungwy in #1151
- follow up for more cleanup by @dataders in #1168
- [bug] [REST] Dont remove identifier root by @kevinjqliu in #1172
- fix: support MonthTransform for partitioning by @felixscherz in #1176
- Add metadata tables for `data_files` and `delete_files` by @soumya-ghosh in #1066
- Use ArrowScan.to_table to replace project_table by @JE-Chen in #1180
- Add Docstrings to `pyiceberg/table/__init__.py` by @sungwy in #1189
- Support python 3.12 in poetry by @kevinjqliu in #1192
- Use `cachetools's LRUCache` to cache manifest list by @kevinjqliu in #1187
- HA HMS support by @awdavidson in #752
- Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large by @sungwy in #1141
- Remove dead loom link by @kevinjqliu in #1213
- Drop support for Python 3.8 by @raulcd in #1221
- Add clarifying docs to transform result types by @kevinzwang in #1211
- Add flag to allow disabling creation of catalog tables by @isc-patrick in #1155
- Bug Fix: Glue and Hive catalog return only Iceberg tables by @mark-major in #1145
- Move snapshot history expire table properties to constants by @ndrluis in #1217
- abort the whole table transaction if any updates in the transaction has failed by @stevie9868 in #1246
- PyArrow: Pass in null-mask by @Fokko in #1264
- Bump PyArrow to 18.0.0 by @Fokko in #1256
- Remove numpy as a hard dependency by @Fokko in #1270
- Allow for missing operation by @Fokko in #1263
- fix: list_tables method in glue catalog now only return tables. by @omkenge in #1258
- Replace `numpy` usage and remove from `pyproject.toml` by @kevinjqliu in #1272
- Bump version to 0.8.0 by @Fokko in #1276
- Remove `initial_change` when CreateTableTransaction apply table updates on an empty metadata by @HonahX in #1219
- Deprecate for 0.8.0 release by @kevinjqliu in #1269
- Pass table-token to commit endpoint by @Fokko in #1278
- Updating configuration docs by @Samreay in #1292
- Allow union of `{int,long}`, `{float,double}`, etc by @Fokko in #1283
- Allow passing in ARN Role and Session name to the `PyArrowFileIO` by @Fokko in #1...
pyiceberg-0.8.0rc2
What's Changed
PR
- Update PyIceberg Verify Release doc by @chinmay-bhat in #976
- DOCS: Add Github Actions Screenshots to Release Notes by @sungwy in #975
- Bump up version in dev Dockerfile and Issue Template by @ndrluis in #981
- Fix pydantic warning in the commit process by @ndrluis in #972
- Bump up Iceberg version to 1.6.0 by @ndrluis in #982
- Bug Fix: use appropriate partition spec for delete by @sungwy in #984
- [Bug Fix] Use `self.table_metadata` when in transaction by @HonahX in #985
- DOCS: Add more post release notes by @sungwy in #983
- Treat warning as error in CI/Dev by @ndrluis in #973
- Use 'strtobool' instead of comparing with a string. by @ndrluis in #988
- Fix: accept empty arrays in struct field lookup by @grobgl in #997
- Add ndrluis as collaborator by @sungwy in #1009
- Fix list namespace response in rest catalog by @ndrluis in #995
- Pyarrow IO property for configuring large v small types on read by @sungwy in #986
- Update metadata-log for non-rest catalogs by @soumya-ghosh in #977
- Exclude Python 3.9.7 due to import error in catalog module by @ndrluis in #526
- Deprecate rest.authorization-url in favor of oauth2-server-uri by @ndrluis in #962
- Allow setting `write.parquet.row-group-limit` by @Fokko in #1016
- Deprecate Redundant Identifier Support in TableIdentifier, and row_filter by @sungwy in #994
- Fix: Handle Empty RecordBatch within `_task_to_record_batches`, fix correctness issue with positional deletes by @sungwy in #1026
- Fix overwrite when filtering all the data by @ndrluis in #1023
- Allow setting `write.parquet.page-row-limit` by @Fokko in #1017
- DOCS: Remove older row for `write.parquet.row-group-limit` by @sungwy in #1030
- Improve test_version_format() error message for version mismatches by @laksh-krishna-sharma in #1015
- Bump version to 0.7.1 by @sungwy in #1034
- Support s3.signer.endpoint for nessie by @guitcastro in #1029
- [bug] fix reading with `to_arrow_batch_reader` and `limit` by @kevinjqliu in #1042
- Use `VisitorWithPartner` for name-mapping by @Fokko in #1014
- Fix tracing existing entries when there are deletes by @Fokko in #1046
- Coverage Run unit tests first before docker containers are set up by @Minfante377 in #1055
- Update "verify release" instruction by @kevinjqliu in #1064
- Fix Install Issues with `docutils = 0.21.post1` and exclude 3.12 from supported python dependencies by @sungwy in #1067
- Post Release 0.7.1 version updates by @sungwy in #1073
- Update create table doc to clarify ID re-assignment by @paulcichonski in #1072
- Refactor PyArrow DataFiles Projection functions by @sungwy in #1043
- DOCS: Exclude signature files from twine upload by @sungwy in #1071
- Increase the minimal required pyarrow version to 14.0.0 by @ndrluis in #1090
- Fix `table_exists` behavior in REST catalog by @ndrluis in #1096
- fix: improve makefile by @TiansuYu in #1091
- fix (issue-1079): allow update_column to set doc as '' by @TiansuYu in #1083
- prevent adding duplicate files by @amitgilad3 in #1036
- Add list_views to rest catalog by @ndrluis in #817
- Emit warnings instead of failing when seeing unsupported configuration by @Fokko in #1111
- Use `markdownlint` instead of `mdformat` by @kevinjqliu in #1118
- Add drop_view to the rest catalog by @ndrluis in #820
- Support python 3.12 by @kevinjqliu in #1068
- Make `commit_table` public by @Fokko in #1112
- Refactoring: Break down very large `table/__init__.py` module by @sungwy in #1144
- fix: Invert `case_sensitive` logic in StructType by @AnthonyLam in #1147
- Bump `duckdb` to version `1.1.0` by @kevinjqliu in #1149
- Deprecate ADLFS prefix in favor of ADLS by @ndrluis in #961
- Cache Manifest files by @chinmay-bhat in #787
- Use the correct spec when rewriting existing manifests by @Fokko in #1157
- Bug Fix: Use historical partition field name by @sungwy in #1161
- fix: remove old, incorrect docstring by @dataders in #1166
- Preserve Backward compatibility in 0.8.0 for #1144 by @sungwy in #1151
- follow up for more cleanup by @dataders in #1168
- [bug] [REST] Dont remove identifier root by @kevinjqliu in #1172
- fix: support MonthTransform for partitioning by @felixscherz in #1176
- Add metadata tables for `data_files` and `delete_files` by @soumya-ghosh in #1066
- Use ArrowScan.to_table to replace project_table by @JE-Chen in #1180
- Add Docstrings to `pyiceberg/table/__init__.py` by @sungwy in #1189
- Support python 3.12 in poetry by @kevinjqliu in #1192
- Use `cachetools's LRUCache` to cache manifest list by @kevinjqliu in #1187
- HA HMS support by @awdavidson in #752
- Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large by @sungwy in #1141
- Remove dead loom link by @kevinjqliu in #1213
- Drop support for Python 3.8 by @raulcd in #1221
- Add clarifying docs to transform result types by @kevinzwang in #1211
- Add flag to allow disabling creation of catalog tables by @isc-patrick in #1155
- Bug Fix: Glue and Hive catalog return only Iceberg tables by @mark-major in #1145
- Move snapshot history expire table properties to constants by @ndrluis in #1217
- abort the whole table transaction if any updates in the transaction has failed by @stevie9868 in #1246
- PyArrow: Pass in null-mask by @Fokko in #1264
- Bump PyArrow to 18.0.0 by @Fokko in #1256
- Remove numpy as a hard dependency by @Fokko in #1270
- Allow for missing operation by @Fokko in #1263
- fix: list_tables method in glue catalog now only return tables. by @omkenge in #1258
- Replace `numpy` usage and remove from `pyproject.toml` by @kevinjqliu in #1272
- Bump version to 0.8.0 by @Fokko in #1276
- Remove `initial_change` when CreateTableTransaction apply table updates on an empty metadata by @HonahX in #1219
- Deprecate for 0.8.0 release by @kevinjqliu in #1269
- Pass table-token to commit endpoint by @Fokko in #1278
- Updating configuration docs by @Samreay in #1292
- Allow union of `{int,long}`, `{float,double}`, etc by @Fokko in #1283
- Allow passing in ARN Role and Session name to the `PyArrowFileIO` by @Fokko in #1...
pyiceberg-0.8.0-rc1
What's Changed
PRs
- Update PyIceberg Verify Release doc by @chinmay-bhat in #976
- DOCS: Add Github Actions Screenshots to Release Notes by @sungwy in #975
- Bump up version in dev Dockerfile and Issue Template by @ndrluis in #981
- Fix pydantic warning in the commit process by @ndrluis in #972
- Bump up Iceberg version to 1.6.0 by @ndrluis in #982
- Bug Fix: use appropriate partition spec for delete by @sungwy in #984
- [Bug Fix] Use `self.table_metadata` when in transaction by @HonahX in #985
- DOCS: Add more post release notes by @sungwy in #983
- Treat warning as error in CI/Dev by @ndrluis in #973
- Use 'strtobool' instead of comparing with a string. by @ndrluis in #988
- Fix: accept empty arrays in struct field lookup by @grobgl in #997
- Add ndrluis as collaborator by @sungwy in #1009
- Fix list namespace response in rest catalog by @ndrluis in #995
- Pyarrow IO property for configuring large v small types on read by @sungwy in #986
- Update metadata-log for non-rest catalogs by @soumya-ghosh in #977
- Exclude Python 3.9.7 due to import error in catalog module by @ndrluis in #526
- Deprecate rest.authorization-url in favor of oauth2-server-uri by @ndrluis in #962
- Allow setting `write.parquet.row-group-limit` by @Fokko in #1016
- Deprecate Redundant Identifier Support in TableIdentifier, and row_filter by @sungwy in #994
- Fix: Handle Empty RecordBatch within `_task_to_record_batches`, fix correctness issue with positional deletes by @sungwy in #1026
- Fix overwrite when filtering all the data by @ndrluis in #1023
- Allow setting `write.parquet.page-row-limit` by @Fokko in #1017
- DOCS: Remove older row for `write.parquet.row-group-limit` by @sungwy in #1030
- Improve test_version_format() error message for version mismatches by @laksh-krishna-sharma in #1015
- Bump version to 0.7.1 by @sungwy in #1034
- Support s3.signer.endpoint for nessie by @guitcastro in #1029
- [bug] fix reading with `to_arrow_batch_reader` and `limit` by @kevinjqliu in #1042
- Use `VisitorWithPartner` for name-mapping by @Fokko in #1014
- Fix tracing existing entries when there are deletes by @Fokko in #1046
- Coverage Run unit tests first before docker containers are set up by @Minfante377 in #1055
- Update "verify release" instruction by @kevinjqliu in #1064
- Fix Install Issues with `docutils = 0.21.post1` and exclude 3.12 from supported python dependencies by @sungwy in #1067
- Post Release 0.7.1 version updates by @sungwy in #1073
- Update create table doc to clarify ID re-assignment by @paulcichonski in #1072
- Refactor PyArrow DataFiles Projection functions by @sungwy in #1043
- DOCS: Exclude signature files from twine upload by @sungwy in #1071
- Increase the minimal required pyarrow version to 14.0.0 by @ndrluis in #1090
- Fix `table_exists` behavior in REST catalog by @ndrluis in #1096
- fix: improve makefile by @TiansuYu in #1091
- fix (issue-1079): allow update_column to set doc as '' by @TiansuYu in #1083
- prevent adding duplicate files by @amitgilad3 in #1036
- Add list_views to rest catalog by @ndrluis in #817
- Emit warnings instead of failing when seeing unsupported configuration by @Fokko in #1111
- Use `markdownlint` instead of `mdformat` by @kevinjqliu in #1118
- Add drop_view to the rest catalog by @ndrluis in #820
- Support python 3.12 by @kevinjqliu in #1068
- Make `commit_table` public by @Fokko in #1112
- Refactoring: Break down very large `table/__init__.py` module by @sungwy in #1144
- fix: Invert `case_sensitive` logic in StructType by @AnthonyLam in #1147
- Bump `duckdb` to version `1.1.0` by @kevinjqliu in #1149
- Deprecate ADLFS prefix in favor of ADLS by @ndrluis in #961
- Cache Manifest files by @chinmay-bhat in #787
- Use the correct spec when rewriting existing manifests by @Fokko in #1157
- Bug Fix: Use historical partition field name by @sungwy in #1161
- fix: remove old, incorrect docstring by @dataders in #1166
- Preserve Backward compatibility in 0.8.0 for #1144 by @sungwy in #1151
- follow up for more cleanup by @dataders in #1168
- [bug] [REST] Dont remove identifier root by @kevinjqliu in #1172
- fix: support MonthTransform for partitioning by @felixscherz in #1176
- Add metadata tables for `data_files` and `delete_files` by @soumya-ghosh in #1066
- Use ArrowScan.to_table to replace project_table by @JE-Chen in #1180
- Add Docstrings to `pyiceberg/table/__init__.py` by @sungwy in #1189
- Support python 3.12 in poetry by @kevinjqliu in #1192
- Use `cachetools's LRUCache` to cache manifest list by @kevinjqliu in #1187
- HA HMS support by @awdavidson in #752
- Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large by @sungwy in #1141
- Remove dead loom link by @kevinjqliu in #1213
- Drop support for Python 3.8 by @raulcd in #1221
- Add clarifying docs to transform result types by @kevinzwang in #1211
- Add flag to allow disabling creation of catalog tables by @isc-patrick in #1155
- Bug Fix: Glue and Hive catalog return only Iceberg tables by @mark-major in #1145
- Move snapshot history expire table properties to constants by @ndrluis in #1217
- abort the whole table transaction if any updates in the transaction has failed by @stevie9868 in #1246
- PyArrow: Pass in null-mask by @Fokko in #1264
- Bump PyArrow to 18.0.0 by @Fokko in #1256
- Remove numpy as a hard dependency by @Fokko in #1270
- Allow for missing operation by @Fokko in #1263
- fix: list_tables method in glue catalog now only return tables. by @omkenge in #1258
- Replace `numpy` usage and remove from `pyproject.toml` by @kevinjqliu in #1272
- Bump version to 0.8.0 by @Fokko in #1276
- Remove `initial_change` when CreateTableTransaction apply table updates on an empty metadata by @HonahX in #1219
- Deprecate for 0.8.0 release by @kevinjqliu in #1269
- Pass table-token to commit endpoint by @Fokko in #1278
- Updating configuration docs by @Samreay in #1292
- Allow union of `{int,long}`, `{float,double}`, etc by @Fokko in #1283
- Allow passing in ARN Role and Session name to the `PyArrowFileIO` by @Fokko in https://github.com/apache/iceberg-python/pull/...
pyiceberg-0.7.1
What's Changed
- Fix `delete` to trace existing manifests when a data file is partially rewritten by @Fokko in #1046
- Fix 'to_arrow_batch_reader' to respect the limit input arg by @kevinjqliu in #1042
- Fix correctness of applying positional deletes on Merge-On-Read tables by @sungwy in #1026
- Fix overwrite when filtering data by @ndrluis in #1023
- Bug fix for deletes across multiple partition specs on partition evolution by @sungwy in #984
- Fix evolving the table and writing in the same transaction by @HonahX in #985
- Fix scans when result is empty by @grobgl in #997
- Fix ListNamespace response in REST Catalog by @ndrluis in #995
- Exclude Python 3.9.7 from list of supported versions by @ndrluis in #526
- Allow setting write.parquet.row-group-limit by @Fokko in #1016
- Allow setting write.parquet.page-row-limit by @Fokko in #1017
- Fix pydantic warning during commit by @ndrluis in #972
Full Changelog: pyiceberg-0.7.0...pyiceberg-0.7.1
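The `to_arrow_batch_reader` fix in this release concerns honoring the `limit` argument when results are streamed in batches. The sketch below shows the general idea of applying a row limit across a stream of batches. Plain Python lists stand in for Arrow record batches, and `limit_batches` is a hypothetical helper, not the pyiceberg implementation.

```python
from typing import Iterable, Iterator, List


def limit_batches(batches: Iterable[List[int]], limit: int) -> Iterator[List[int]]:
    """Hypothetical sketch: yield batches until `limit` rows have been produced."""
    remaining = limit
    for batch in batches:
        if remaining <= 0:
            return
        if len(batch) > remaining:
            # Truncate the final batch so the total row count equals the limit.
            yield batch[:remaining]
            return
        yield batch
        remaining -= len(batch)


# Stand-in data: three "batches" holding seven rows in total.
batches = [[1, 2, 3], [4, 5, 6], [7]]
print(list(limit_batches(batches, 4)))
```

Tracking the remaining row count across batches, rather than per batch, is what keeps the total at or below the limit.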
pyiceberg-0.7.0
What's Changed
- Build: Bump getdaft from 0.2.14 to 0.2.15 by @dependabot in #434
- Build: Bump cryptography from 42.0.0 to 42.0.2 by @dependabot in #440
- docs: Add missing release steps by @Fokko in #443
- Build: Bump moto from 5.0.1 to 5.0.2 by @dependabot in #447
- Build: Bump mkdocs-material from 9.5.9 to 9.5.10 by @dependabot in #448
- Make the snapshot creation part of the `Transaction` by @Fokko in #446
- Send X-Iceberg-Access-Delegation header to signal support for vended credentials/remote signing by @nastra in #436
- Retry with new Access Token on 419 response by @anupam-saini in #340
- Reuse commit-uuid as the write-uuid by @Fokko in #437
- Update NameMapping on update_schema() by @sungwy in #441
- Feat: Implement `create_table_if_not_exists` by @hussein-awala in #415
- Build: Bump coverage from 7.4.1 to 7.4.2 by @dependabot in #457
- Build: Bump getdaft from 0.2.15 to 0.2.16 by @dependabot in #456
- Accept pyarrow LargeListType and FixedSizeListType by @hussein-awala in #458
- Bump pre-commit and such by @Fokko in #442
- docstring: Fix missing commit by @Fokko in #432
- Improve error message in case of a mismatch by @Fokko in #352
- Cleanup conftest, remove LocalOutputFile by @kevinjqliu in #468
- Fix `InMemoryCatalog` Catalog commit operation by @anupam-saini in #470
- enable set hadoop ugi for hive catalog by @j7nhai in #472
- Raise exception if namespace does not exist in load_namespace_properties for Sql Catalog by @rushilshah1 in #477
- Add Support for Custom Header Configurations in RESTCatalog by @geruh in #467
- rest: Set OAuth Content-Type header explicitly by @Fokko in #478
- Partition Evolution by @amogh-jahagirdar in #245
- Fix retrying logic by @Fokko in #480
- Remove unused catalog from integration test by @kevinjqliu in #481
- add github add to check md link by @kevinjqliu in #324
- Sort Order update by @anupam-saini in #476
- Make issued_token_type optional to support OAuth2 Client Credential Flow by @flyrain in #466
- Update table metadata throughout transaction by @Fokko in #471
- Allow non-string typed values in table properties by @kevinjqliu in #469
- Construction of filenames for partitioned writes by @jqin61 in #453
- Remove extraneous import by @Fokko in #485
- Default spark session timezone to UTC in test by @kevinjqliu in #494
- Fix dead links in docs by @kevinjqliu in #493
- Update bug issue template release list by @ndrluis in #496
- add rest scope in the config documentation by @himadripal in #495
- Make scope configurable by @himadripal in #484
- Tests should explicitly check for `schema_id` by @kevinjqliu in #487
- add support for glue.id by @jrouly in #490
- [Bug Fix] cast None `current-snapshot-id` as -1 for Backwards Compatibility by @sungwy in #473
- Make optional oauth configurable by @himadripal in #486
- Disable Spark Catalog caching for integration tests by @kevinjqliu in #501
- Set table properties with dictionary by @kevinjqliu in #503
- Imports decouple by @ndrluis in #505
- Allow setting non-string typed values in `set_properties` by @kevinjqliu in #504
- [Bug fix] update name mapping in Transaction.update_schema by @sungwy in #508
- [Bug Fix] Allow Partition data to be nullable in ManifestEntry by @sungwy in #509
- Allow fsspec up to 2025.1 by @bolkedebruin in #510
- Build: Bump pypa/cibuildwheel from 2.16.5 to 2.17.0 by @dependabot in #517
- Decouple imports reported by mypy linter by @ndrluis in #519
- build: Move back to the mmh3 by @Fokko in #460
- Improve the InMemory Catalog Implementation by @kevinjqliu in #289
- Add `table_exists` method to Catalog by @anupam-saini in #512
- Add StrictMetricsEvaluator by @Fokko in #518
- Add Data Files from Parquet Files to UnPartitioned Table by @sungwy in #506
- Fix CommitTableRequest serialisation by @kdbhiggins in #525
- Add partition stats in snapshot summary by @jqin61 in #521
- UUID literal to binary and fixed by @sebpretzer in #529
- Adding a new dev dep, `deptry` by @kevinjqliu in #528
- Add as_arrow() to Schema class by @ndrluis in #532
- Change Append/Overwrite API to accept snapshot properties by @Gowthami03B in #419
- Fix Glue Integration test by @HonahX in #536
- Add Snapshots table metadata by @Fokko in #524
- `add_files` support partitioned tables by @sungwy in #531
- [Bug Fix] Fix TableMetadataV1 Validators by @HonahX in #544
- Fix race condition on `Table.scan` with `limit` by @kevinjqliu in #545
- Add Strict projection by @Fokko in #539
- Fix the Avro tests by @Fokko in #552
- On write operation, cast data to Iceberg Table's pyarrow schema by @kevinjqliu in #523
- Bin-pack Writes Operation into multiple parquet files, and parallelize writing `WriteTask`s by @kevinjqliu in #444
- Bump version to 0.6.1 by @HonahX in #561
- Minor fixes, #523 followup by @kevinjqliu in #563
- Call as_arrow() call in `overwrite` by @kevinjqliu in #565
- Remove `as visitors` import by @Fokko in #567
- Tests: Make Spark optional for testing by @Fokko in #568
- [CI FIx] Use Docker Compose V2 by @HonahX in #575
- typealias for table version by @MehulBatra in #566
- Disallow default header to be overwritten by @whynick1 in #577
- [Doc] Update how-to-release.md by @HonahX in #576
- Support CreateTableTransaction in Glue and Rest by @HonahX in #498
- Move writes to Transaction by @sungwy in #571
- Add entries metadata table by @Fokko in #551
- Partitioned Append on Identity Transform by @jqin61 in #555
- Implement `__getstate__` and `__setstate__` on PyArrowFileIO and FsSpecFileIO so that they can be pickled by @amogh-jahagirdar in #543
- [Bug Fix] Allow HiveCatalog to create table with TimestamptzType by @HonahX in #585
- Change DataScan to accept Metadata and io by @Fokko in #581
- Read: fetch file_schema directly from pyarrow_to_schema by @HonahX in #597
- Support Time Travel in InspectTable.entries by @sungwy in #599
- [Bug Fix] HiveCatalog's _commit_table need to refresh and update the metadata in a ...
PyIceberg 0.6.1
Patch release:
- Fail to create version 1 table with non-empty partition-spec and sort-order
- Hive Catalog cannot create table with TimestamptzType field
- Fail to read parquet file with special characters in column names
- Hive Catalog commit consistency issue
- docutils=0.21 installation issue
Full Changelog: https://github.com/apache/iceberg-python/commits/pyiceberg-0.6.1
PyIceberg 0.6.0
What's Changed
- Python: Migrate from `iceberg` to `iceberg-python` by @Fokko in #3
- Build: Bump duckdb from 0.8.1 to 0.9.0 by @dependabot in #4
- Build: Bump mkdocs-section-index from 0.3.7 to 0.3.8 by @dependabot in #5
- Build: Bump mkdocstrings-python from 1.7.0 to 1.7.1 by @dependabot in #6
- Build: Bump pydantic from 2.3.0 to 2.4.2 by @dependabot in #7
- Build: Bump psycopg2-binary from 2.9.7 to 2.9.8 by @dependabot in #8
- Build: Bump moto from 4.2.4 to 4.2.5 by @dependabot in #9
- Build: Bump mkdocs-material from 9.4.1 to 9.4.2 by @dependabot in #10
- Build: Bump rich from 13.5.3 to 13.6.0 by @dependabot in #11
- Build: Bump typing-extensions from 4.7.1 to 4.8.0 by @dependabot in #12
- Build: Bump griffe from 0.36.2 to 0.36.4 by @dependabot in #13
- Build: Bump urllib3 from 1.26.16 to 1.26.17 by @dependabot in #36
- Update how to release by @Fokko in #34
- pydantic exclude 2.4.0, 2.4.1 by @syun64 in #38
- Add logic to generate a new snapshot-id by @Fokko in #37
- Fix the TableIdentifier by @Fokko in #44
- Convert the Logical to Physical map to a visitor by @Fokko in #43
- Build: Bump mkdocstrings-python from 1.7.1 to 1.7.2 by @dependabot in #52
- Build: Bump fastavro from 1.8.3 to 1.8.4 by @dependabot in #51
- Build: Bump pypa/cibuildwheel from 2.16.0 to 2.16.2 by @dependabot in #47
- Build: Bump psycopg2-binary from 2.9.8 to 2.9.9 by @dependabot in #49
- Build: Bump coverage from 7.3.1 to 7.3.2 by @dependabot in #50
- Build: Bump cython from 3.0.2 to 3.0.3 by @dependabot in #48
- Docs: Fix repo name and url by @manuzhang in #54
- Run integration tests with Iceberg 1.4.0 by @Fokko in #56
- Add logic for table format-version updates by @Fokko in #55
- Disable merge-commit and enforce linear history by @Fokko in #57
- Construct a writer tree by @Fokko in #40
- Add method and property around sequence-numbers by @Fokko in #60
- Fix column rename doc example to reflect correct API by @cabhishek in #59
- Expression: Part of the expression is ignored when multiple and/or expressions are specified by @amogh-jahagirdar in #65
- Fix Iceberg to Avro Schema Conversion: Fixed, Decimal, UUID by @HonahX in #53
- allow override env-variables in load_catalog by @bdilday in #45
- Make `next_sequence_number` private by @Fokko in #62
- Check for empty responses by @Fokko in #69
- Fix Arrow fixed type by @Fokko in #70
- Bump version to 0.5.1 by @Fokko in #68
- Add `spec_id` back to data file by @puchengy in #63
- Build: Bump ray from 2.7.0 to 2.7.1 by @dependabot in #77
- Build: Bump griffe from 0.36.4 to 0.36.5 by @dependabot in #76
- Build: Bump mypy-boto3-glue from 1.28.36 to 1.28.63 by @dependabot in #75
- Build: Bump mkdocstrings-python from 1.7.2 to 1.7.3 by @dependabot in #74
- Build: Bump moto from 4.2.5 to 4.2.6 by @dependabot in #73
- Remove python working directory by @Fokko in #71
- Don't fail on warning when releasing by @Fokko in #80
- Remove `example` since it is deprecated by @Fokko in #79
- Build: Bump urllib3 from 1.26.17 to 1.26.18 by @dependabot in #84
- Doc: Fix "Verifying Checksums" script in verify-release.md by @HonahX in #82
- Make to_arrow function capable of handling parquet files with sanitized name due to Avro restriction by @puchengy in #83
- Require full expression parse match by @danielcweeks in #88
- Fix NotStartsWith negation by @danielcweeks in #92
- Fix some broken commands and URLs in the docs by @hussein-awala in #89
- Update like statements to reflect sql behaviors by @danielcweeks in #91
- Fix equality of bound expressions by @Fokko in #95
- Build: Bump mkdocs-material from 9.4.2 to 9.4.6 by @dependabot in #100
- Build: Bump pytest-mock from 3.11.1 to 3.12.0 by @dependabot in #99
- Build: Bump sqlalchemy from 2.0.21 to 2.0.22 by @dependabot in #98
- Build: Bump griffe from 0.36.5 to 0.36.7 by @dependabot in #97
- Build: Bump adlfs from 2023.9.0 to 2023.10.0 by @dependabot in #96
- Replace old `%-formatted` by `f-strings` by @hussein-awala in #93
- Fix literal predicate equality check by @danielcweeks in #94
- Fix the nullability of `snapshot-id` on `AssertRefSnapshotId` by @Fokko in #103
- Build: Bump werkzeug from 2.3.7 to 3.0.1 by @dependabot in #105
- Api docs refactor by @mobley-trent in #106
- Fixed typos by @whisk in #108
- Build: Bump duckdb from 0.9.0 to 0.9.1 by @dependabot in #114
- Build: Bump pre-commit from 3.4.0 to 3.5.0 by @dependabot in #113
- Build: Bump mkdocs-material from 9.4.6 to 9.4.7 by @dependabot in #111
- Build: Bump pytest from 7.4.2 to 7.4.3 by @dependabot in #112
- Build: Bump moto from 4.2.6 to 4.2.7 by @dependabot in #110
- fix: partition evaluator thread safety by @skellys in #115
- Run dependabot daily by @Fokko in #66
- Build: Bump griffe from 0.36.7 to 0.36.9 by @dependabot in #118
- Build: Bump cython from 3.0.3 to 3.0.5 by @dependabot in #122
- Build: Bump sqlalchemy from 2.0.22 to 2.0.23 by @dependabot in #125
- Build: Bump zstandard from 0.21.0 to 0.22.0 by @dependabot in #120
- Build: Bump fastavro from 1.8.4 to 1.9.0 by @dependabot in #119
- Refactor Arrow schema conversion by @Fokko in #117
- Build: Bump pyarrow from 13.0.0 to 14.0.0 by @Fokko in #126
- Build: Bump mkdocs-material-extensions from 1.2 to 1.3 by @dependabot in #128
- Add flake8-pie to ruff by @Fokko in #86
- Update pre-commit by @Fokko in #85
- Bump version to 0.6.0 by @Fokko in #72
- Build: Bump mypy-boto3-glue from 1.28.63 to 1.28.77 by @dependabot in #130
- Catch warning in PyLint tests by @Fokko in #33
- Build: Bump mkdocs-material from 9.4.7 to 9.4.8 by @dependabot in #131
- Fix Github Pages path by @Fokko in #133
- Build: Bump pyarrow from 14.0.0 to 14.0.1 by @dependabot in #136
- Add list-refs cli command by @amogh-jahagirdar in #137
- Docs: Add section on pandas by @Fokko in #138
- Build: Bump mkdocstrings-python from 1.7.3 to 1.7.4 by @dependabot in #142
...