Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ZEPPELIN-6121] Write a Dockerfile for python interpreter image build #4865

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

seung-00
Copy link
Contributor

@seung-00 seung-00 commented Oct 5, 2024

What is this PR for?

This PR adds a Dockerfile to build a Python interpreter container. I think it could be helpful to use distributed computing resources.

What type of PR is it?

Improvement

Todos

What is the Jira issue?

How should this be tested?

  • Build and run the docker container
  • Verify that the interpreter container communicates with the zeppelin server

Screenshots (if appropriate)

Questions:

  • Does the license files need to update? Y
  • Is there breaking changes for older versions? N
  • Does this needs documentation? Y

@seung-00 seung-00 changed the title feat: Write a Dockerfile for python interpreter image build [ZEPPELIN-6121] Write a Dockerfile for python interpreter image build Oct 6, 2024
RUN chmod +x ./mvnw

RUN ./mvnw clean package -am -pl zeppelin-interpreter-shaded,zeppelin-interpreter,python -DskipTests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if it might be better to ensure that %python, %python.ipython, and %python.sql are all supported by default.
My thinking comes from the fact that these interpreters are listed in the overview section of the Zeppelin Python interpreter documentation, and IPython is also recommended there.

For IPython, it seems that adding the following command here can be useful for installing the necessary packages:

Suggested change
RUN pip install jupyter-client grpcio protobuf~=3.20 ipython ipykernel

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tbonelee
Thank you for your comment. I have applied your suggestion and agree with you. That said, there are additional libraries, like pandas, that wouldn't be installed. In my opinion, we need an alternative solution to inject these libraries without having to rewrite/deploy the Dockerfile. I think it'd be better to address this in a separate task.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your point.
I believe that handling additional packages, like pandas, would require a more flexible approach. Such as using a conda-like package management system. It might be better to address this in a separate PR later.

@seung-00 seung-00 requested a review from tbonelee October 9, 2024 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants