[#3515] feat(flink-connector): Support flink iceberg catalog #5914

Merged (2 commits) on Jan 16, 2025

sunxiaojian
Contributor

What changes were proposed in this pull request?

Support flink iceberg catalog

Why are the changes needed?

Fix: #3515

Does this PR introduce any user-facing change?

no

How was this patch tested?

FlinkIcebergCatalogIT
FlinkIcebergHiveCatalogIT

sunxiaojian changed the title from "[#3515]feat(flink-connector)Support flink iceberg catalog" to "[#3515] feat(flink-connector): Support flink iceberg catalog" on Dec 19, 2024
sunxiaojian force-pushed the support-flink-iceberg-catalog branch 4 times, most recently from 24cd6b8 to ef294ba, on December 19, 2024 07:37
@sunxiaojian
Contributor Author

@FANNG1 @coolderli PTAL

@FANNG1
Contributor

FANNG1 commented Dec 19, 2024

Cool! I'll review this PR, but it may take some time. :)

@sunxiaojian
Contributor Author

Cool! I'll review this PR, but it may take some time. :)

ok, thanks

Comment on lines 8 to 10
The Apache Gravitino Flink connector offers the capability to read and write Iceberg tables, with the metadata managed by the Gravitino server. To enable the use of the Iceberg catalog within the Flink connector, you must download the Iceberg Flink runtime JAR to the Flink classpath.

Contributor

Suggested change
The Apache Gravitino Flink connector offers the capability to read and write Iceberg tables, with the metadata managed by the Gravitino server. To enable the use of the Iceberg catalog within the Flink connector, you must download the Iceberg Flink runtime JAR to the Flink classpath.
The Apache Gravitino Flink connector can be used to read and write Iceberg tables, with the metadata managed by the Gravitino server.
To enable the Flink connector, you must download the Iceberg Flink runtime JAR and place it in the Flink classpath.

- `INSERT INTO & OVERWRITE`
- `SELECT`

#### Not supported operations:
Contributor

Suggested change
#### Not supported operations:
#### Operations Not Supported:


## Catalog properties

Gravitino Flink connector will transform the following property names defined in catalog properties to Flink Iceberg connector configuration.
Contributor

Suggested change
Gravitino Flink connector will transform the following property names defined in catalog properties to Flink Iceberg connector configuration.
The Gravitino Flink connector transforms the following properties in a catalog to Flink connector configuration.


### S3

You need to add s3 secret to the Flink configuration using `s3.access-key-id` and `s3.secret-access-key`. Additionally, download the [Iceberg AWS bundle](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-aws-bundle) and place it in the classpath of Flink.
Contributor

Suggested change
You need to add s3 secret to the Flink configuration using `s3.access-key-id` and `s3.secret-access-key`. Additionally, download the [Iceberg AWS bundle](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-aws-bundle) and place it in the classpath of Flink.
You need to add an S3 secret to the Flink configuration using `s3.access-key-id` and `s3.secret-access-key`.
Additionally, you need to download the [Iceberg AWS bundle](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-aws-bundle)
and place it in the Flink classpath.
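
As a rough illustration of the requirement described in this suggestion, the sketch below checks that both S3 keys named in the doc are present and non-empty before a job is started. The `missing_s3_keys` helper and the use of a plain dict to stand in for the Flink configuration are assumptions made for this example; neither is part of Gravitino or Flink.

```python
# Hypothetical pre-flight check; not part of Gravitino or Flink.
# The key names are copied from the documentation text under review.
REQUIRED_S3_KEYS = ("s3.access-key-id", "s3.secret-access-key")


def missing_s3_keys(flink_conf):
    """Return the required S3 keys that are absent or empty in flink_conf.

    flink_conf is assumed to be a plain mapping of configuration keys to
    string values, loaded by whatever means the deployment uses.
    """
    return [key for key in REQUIRED_S3_KEYS if not flink_conf.get(key)]
```

An empty result means both keys are set; it says nothing about whether the credentials are valid or whether the Iceberg AWS bundle is on the classpath.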


### OSS

You need to add OSS secret key to the Flink configuration using `client.access-key-id` and `client.access-key-secret`. Additionally, download the [Aliyun OSS SDK](https://gosspublic.alicdn.com/sdks/java/aliyun_java_sdk_3.10.2.zip) and copy `aliyun-sdk-oss-3.10.2.jar`, `hamcrest-core-1.1.jar`, `jdom2-2.0.6.jar` in the classpath of Flink.
Contributor

Suggested change
You need to add OSS secret key to the Flink configuration using `client.access-key-id` and `client.access-key-secret`. Additionally, download the [Aliyun OSS SDK](https://gosspublic.alicdn.com/sdks/java/aliyun_java_sdk_3.10.2.zip) and copy `aliyun-sdk-oss-3.10.2.jar`, `hamcrest-core-1.1.jar`, `jdom2-2.0.6.jar` in the classpath of Flink.
You need to add an OSS secret key to the Flink configuration using `client.access-key-id` and `client.access-key-secret`.
Additionally, you need to download the [Aliyun OSS SDK](https://gosspublic.alicdn.com/sdks/java/aliyun_java_sdk_3.10.2.zip),
and copy `aliyun-sdk-oss-3.10.2.jar`, `hamcrest-core-1.1.jar`, `jdom2-2.0.6.jar` to the Flink classpath.
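
Because this step involves copying three separate JARs, a small check can catch a missed file. The sketch below assumes that "the Flink classpath" is a single directory (for example a `lib` directory under the Flink installation); that directory layout and the `missing_oss_jars` helper are assumptions for illustration, not something the PR defines.

```python
from pathlib import Path

# JAR names copied from the documentation text under review.
REQUIRED_OSS_JARS = (
    "aliyun-sdk-oss-3.10.2.jar",
    "hamcrest-core-1.1.jar",
    "jdom2-2.0.6.jar",
)


def missing_oss_jars(flink_lib_dir):
    """Return the required OSS JARs that are not present in flink_lib_dir.

    flink_lib_dir is assumed to be the directory Flink loads JARs from.
    """
    lib = Path(flink_lib_dir)
    return [name for name in REQUIRED_OSS_JARS if not (lib / name).is_file()]
```

Calling `missing_oss_jars` with the path of the Flink library directory returns an empty list when all three files are in place.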


### GCS

No extra configuration is needed. Please make sure the credential file is accessible by Flink, like using `export GOOGLE_APPLICATION_CREDENTIALS=/xx/application_default_credentials.json`, and download [Iceberg GCP bundle](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-gcp-bundle) and place it to the classpath of Flink.
Contributor

Suggested change
No extra configuration is needed. Please make sure the credential file is accessible by Flink, like using `export GOOGLE_APPLICATION_CREDENTIALS=/xx/application_default_credentials.json`, and download [Iceberg GCP bundle](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-gcp-bundle) and place it to the classpath of Flink.
No extra configuration is needed. Please make sure the credential file is accessible by Flink.
For example, `export GOOGLE_APPLICATION_CREDENTIALS=/xx/application_default_credentials.json`.
You also need to download the [Iceberg GCP bundle](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-gcp-bundle) and place it in the Flink classpath.
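
One way to confirm that the credential file is accessible is to check the environment variable from the same user and environment that launches Flink. The following is a minimal sketch; the `gcs_credentials_problem` helper is hypothetical and only inspects the local file system, not the validity of the credentials.

```python
import os


def gcs_credentials_problem(env=None):
    """Describe why the GCS credential file is unusable, or return None.

    env defaults to the current process environment; pass a dict to check
    a different set of variables.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"credential file not found: {path}"
    if not os.access(path, os.R_OK):
        return f"credential file is not readable: {path}"
    return None
```

The check is only meaningful when run as the user that runs the Flink processes, since file permissions and environment variables can differ between users.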

@FANNG1
Contributor

FANNG1 commented Dec 26, 2024

Hi @sunxiaojian, sorry for the delay. I'm working on the issues for the 0.8 release and may not have enough time to review this PR these days.

@xunliu
Member

xunliu commented Dec 28, 2024

Hi @sunxiaojian, thank you for your contributions.
Could you send me an email ([email protected])? I have something I need to discuss with you.

@sunxiaojian
Contributor Author

Hi @sunxiaojian, sorry for the delay. I'm working on the issues for the 0.8 release and may not have enough time to review this PR these days.

@FANNG1 OK, I will also handle the comments above as soon as possible.

@sunxiaojian
Contributor Author

sunxiaojian commented Dec 29, 2024

Hi @sunxiaojian, thank you for your contributions. Could you send me an email ([email protected])? I have something I need to discuss with you.

@xunliu The email has been sent

sunxiaojian force-pushed the support-flink-iceberg-catalog branch 4 times, most recently from fc59502 to 5acadb8, on January 4, 2025 03:33
sunxiaojian force-pushed the support-flink-iceberg-catalog branch 2 times, most recently from 22fadd3 to 9ec6317, on January 13, 2025 16:29
@FANNG1
Contributor

FANNG1 commented Jan 14, 2025

LGTM except for some minor comments. Could you fix them?

sunxiaojian force-pushed the support-flink-iceberg-catalog branch 2 times, most recently from 92b8cd2 to 26d9ead, on January 14, 2025 15:44
@sunxiaojian
Contributor Author

LGTM except for some minor comments. Could you fix them?

@FANNG1 Fixed them all. PTAL again, thanks.

@FANNG1
Contributor

FANNG1 commented Jan 15, 2025

@tengqm @coolderli any other comments?

FANNG1 added the branch-0.8 (Automatically cherry-pick commit to branch-0.8) label on Jan 15, 2025
sunxiaojian force-pushed the support-flink-iceberg-catalog branch 2 times, most recently from f2f1891 to 97f3ff9, on January 15, 2025 11:41
@sunxiaojian
Contributor Author

sunxiaojian commented Jan 15, 2025

After the conflict was fixed, there is an error that seems to indicate that the Hive image cannot be accessed.

@FANNG1
Contributor

FANNG1 commented Jan 15, 2025

After the conflict was fixed, there is an error that seems to indicate that the Hive image cannot be accessed.

Maybe an incidental error; triggering a rerun.

sunxiaojian force-pushed the support-flink-iceberg-catalog branch 2 times, most recently from 989161b to 76138a5, on January 15, 2025 16:53
@FANNG1
Contributor

FANNG1 commented Jan 16, 2025

After the conflict was fixed, there is an error that seems to indicate that the Hive image cannot be accessed.

Maybe an incidental error; triggering a rerun.

odd problem :(

sunxiaojian force-pushed the support-flink-iceberg-catalog branch from 76138a5 to f043f14 on January 16, 2025 03:29
@sunxiaojian
Contributor Author

@FANNG1 fixed

FANNG1 merged commit fd26b56 into apache:main on Jan 16, 2025
25 checks passed
github-actions bot pushed a commit that referenced this pull request Jan 16, 2025
@FANNG1
Contributor

FANNG1 commented Jan 16, 2025

@sunxiaojian, merged to main and cherry-picked to branch-0.8. Thanks for your contribution!

Labels
branch-0.8 Automatically cherry-pick commit to branch-0.8
Development

Successfully merging this pull request may close these issues.

[Subtask] [flink-connector] Support iceberg catalog
5 participants