Make otter-grader installable in JupyterLite #458

Closed
yuvipanda opened this issue Nov 18, 2021 · 24 comments
Labels
enhancement New feature or request
Milestone

Comments

@yuvipanda
Contributor

yuvipanda commented Nov 18, 2021

Is your feature request related to a problem? Please describe.

JupyterLite is a full-blown scientific Python environment that runs entirely in your browser, no server required! You can try a demo here.

If you try to install otter-grader with:

import micropip
await micropip.install('otter-grader')

It fails while trying to install the pypdf2 package. JupyterLite can only install pure-Python packages (other than the compiled packages it ships with, such as numpy and pandas), so these dependencies will need to be made optional wherever possible.

Describe the solution you'd like

Figure out which packages are required for the user-facing functionality, and try to make everything else optional.

Describe alternatives you've considered

  1. Don't support jupyterlite

Additional context

/cc @ericvd-ucb, who really wants this, and @jtpio, the core dev of JupyterLite.

@yuvipanda yuvipanda added the enhancement New feature or request label Nov 18, 2021
@yuvipanda
Contributor Author

@jtpio I think nbconvert in particular is going to be very problematic. How do you think we can work around that?

@jtpio

jtpio commented Nov 18, 2021

I haven't checked the code base yet, but is nbconvert a hard dependency?

Could otter-grader still run without it, with some functionality unavailable? Or is it part of the core library, and therefore a hard requirement?

Nevertheless it would be useful to be able to install nbconvert in jupyterlite / pyodide. Probably the pyzmq dependency could be made optional if we are only interested in the export.

@chrispyles
Member

nbconvert isn't required for the basic functionality. It would be possible to refactor otter to disable certain features in environments without nbconvert installed.
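
To illustrate the kind of refactor described here, below is a minimal sketch of an optional-dependency guard. It is not Otter's actual code: `has_module`, `HAS_NBCONVERT`, and `export_to_html` are hypothetical names chosen for illustration, and the only nbconvert call assumed is `HTMLExporter().from_filename`.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


# Evaluated once at import time; False in environments that lack nbconvert.
HAS_NBCONVERT = has_module("nbconvert")


def export_to_html(notebook_path):
    """Hypothetical export helper that degrades gracefully without nbconvert."""
    if not HAS_NBCONVERT:
        raise RuntimeError(
            "Exporting requires nbconvert, which is not installed in this environment."
        )
    # Imported lazily so that importing this module never needs nbconvert.
    from nbconvert import HTMLExporter

    body, _resources = HTMLExporter().from_filename(notebook_path)
    return body
```

With a guard like this, the package can still be imported where nbconvert is missing, and only the export feature reports that it is unavailable.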

@yuvipanda
Contributor Author

@chrispyles that would be lovely!

@chrispyles chrispyles added this to the v4.0.0.b0 milestone Dec 10, 2021
@stale

stale bot commented Feb 8, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Feb 8, 2022
@stale stale bot removed the wontfix This will not be worked on label Feb 8, 2022
@ericvd-ucb
Contributor

Hi @chrispyles, just wanted to bump this issue one more time. We were just chatting about how cool it would be to demo Data 8 at a national workshop in June.

@chrispyles
Member

@yuvipanda @jtpio I've resolved the nbconvert issue in this branch, but now when I try to install in JupyterLite I get an error involving click. I suspect it is caused by a dependency, because Otter doesn't pin click to a specific version. Any ideas on how to resolve this?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [5], in <cell line: 1>()
----> 1 await micropip.install("https://raw.githubusercontent.com/chrispyles/example-workshop-2/main/otter_grader-3.2.1-py3-none-any.whl", keep_going=True)

File /lib/python3.10/asyncio/futures.py:284, in Future.__await__(self)
    282 if not self.done():
    283     self._asyncio_future_blocking = True
--> 284     yield self  # This tells Task to wait for completion.
    285 if not self.done():
    286     raise RuntimeError("await wasn't used with future")

File /lib/python3.10/asyncio/tasks.py:304, in Task.__wakeup(self, future)
    302 def __wakeup(self, future):
    303     try:
--> 304         future.result()
    305     except BaseException as exc:
    306         # This may also be a cancellation.
    307         self.__step(exc)

File /lib/python3.10/asyncio/futures.py:201, in Future.result(self)
    199 self.__log_traceback = False
    200 if self._exception is not None:
--> 201     raise self._exception
    202 return self._result

File /lib/python3.10/asyncio/tasks.py:234, in Task.__step(***failed resolving arguments***)
    232         result = coro.send(None)
    233     else:
--> 234         result = coro.throw(exc)
    235 except StopIteration as exc:
    236     if self._must_cancel:
    237         # Task is cancelled right before coro stops.

File /lib/python3.10/site-packages/micropip/_micropip.py:183, in _PackageManager.install(self, requirements, ctx, keep_going)
    180     await install_func
    181     done_callback()
--> 183 transaction = await self.gather_requirements(requirements, ctx, keep_going)
    185 if transaction["failed"]:
    186     failed_requirements = ", ".join(
    187         [f"'{req}'" for req in transaction["failed"]]
    188     )

File /lib/python3.10/site-packages/micropip/_micropip.py:173, in _PackageManager.gather_requirements(self, requirements, ctx, keep_going)
    168 for requirement in requirements:
    169     requirement_promises.append(
    170         self.add_requirement(requirement, ctx, transaction)
    171     )
--> 173 await gather(*requirement_promises)
    174 return transaction

File /lib/python3.10/asyncio/futures.py:284, in Future.__await__(self)
    282 if not self.done():
    283     self._asyncio_future_blocking = True
--> 284     yield self  # This tells Task to wait for completion.
    285 if not self.done():
    286     raise RuntimeError("await wasn't used with future")

File /lib/python3.10/asyncio/tasks.py:304, in Task.__wakeup(self, future)
    302 def __wakeup(self, future):
    303     try:
--> 304         future.result()
    305     except BaseException as exc:
    306         # This may also be a cancellation.
    307         self.__step(exc)

File /lib/python3.10/asyncio/futures.py:201, in Future.result(self)
    199 self.__log_traceback = False
    200 if self._exception is not None:
--> 201     raise self._exception
    202 return self._result

File /lib/python3.10/asyncio/tasks.py:232, in Task.__step(***failed resolving arguments***)
    228 try:
    229     if exc is None:
    230         # We use the `send` method directly, because coroutines
    231         # don't have `__iter__` and `__next__` methods.
--> 232         result = coro.send(None)
    233     else:
    234         result = coro.throw(exc)

File /lib/python3.10/site-packages/micropip/_micropip.py:245, in _PackageManager.add_requirement(self, requirement, ctx, transaction)
    242     if not _is_pure_python_wheel(wheel["filename"]):
    243         raise ValueError(f"'{wheel['filename']}' is not a pure Python 3 wheel")
--> 245     await self.add_wheel(name, wheel, version, (), ctx, transaction)
    246     return
    247 else:

File /lib/python3.10/site-packages/micropip/_micropip.py:316, in _PackageManager.add_wheel(self, name, wheel, version, extras, ctx, transaction)
    314     dist = pkg_resources_distribution_for_wheel(zip_file, name, "???")
    315 for recurs_req in dist.requires(extras):
--> 316     await self.add_requirement(recurs_req, ctx, transaction)
    318 transaction["wheels"].append((name, wheel, version))

File /lib/python3.10/site-packages/micropip/_micropip.py:291, in _PackageManager.add_requirement(self, requirement, ctx, transaction)
    286         raise ValueError(
    287             f"Couldn't find a pure Python 3 wheel for '{req}'. "
    288             "You can use `micropip.install(..., keep_going=True)` to get a list of all packages with missing wheels."
    289         )
    290 else:
--> 291     await self.add_wheel(
    292         req.name, maybe_wheel, maybe_ver, req.extras, ctx, transaction
    293     )

File /lib/python3.10/site-packages/micropip/_micropip.py:316, in _PackageManager.add_wheel(self, name, wheel, version, extras, ctx, transaction)
    314     dist = pkg_resources_distribution_for_wheel(zip_file, name, "???")
    315 for recurs_req in dist.requires(extras):
--> 316     await self.add_requirement(recurs_req, ctx, transaction)
    318 transaction["wheels"].append((name, wheel, version))

File /lib/python3.10/site-packages/micropip/_micropip.py:276, in _PackageManager.add_requirement(self, requirement, ctx, transaction)
    274         return
    275     else:
--> 276         raise ValueError(
    277             f"Requested '{requirement}', "
    278             f"but {req.name}=={ver} is already installed"
    279         )
    280 metadata = await _get_pypi_json(req.name)
    281 maybe_wheel, maybe_ver = self.find_wheel(metadata, req)

ValueError: Requested 'click<=8.0.4', but click==8.1.2 is already installed

Here's the wheel I tried to install: https://raw.githubusercontent.com/chrispyles/example-workshop-2/main/otter_grader-3.2.1-py3-none-any.whl
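
The final `ValueError` above comes down to a version comparison: something in the dependency tree asks for `click<=8.0.4`, while the environment already has click 8.1.2. The sketch below shows that comparison in isolation. It is not micropip's implementation; `parse_simple_version` and `within_upper_bound` are hypothetical helpers that only handle plain dotted integers (no pre-release or local version tags).

```python
def parse_simple_version(text):
    """Turn '8.1.2' into (8, 1, 2); only plain dotted integers are supported."""
    return tuple(int(part) for part in text.split("."))


def within_upper_bound(installed, upper_bound):
    """Return True if `installed` satisfies an `<= upper_bound` constraint."""
    a = parse_simple_version(installed)
    b = parse_simple_version(upper_bound)
    # Pad the shorter tuple with zeros so that '8.0' compares equal to '8.0.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a <= b


# The situation from the traceback: 8.1.2 is installed, but <=8.0.4 is requested.
conflict = not within_upper_bound("8.1.2", "8.0.4")
```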

@chrispyles
Member

Also, the above error resulted from installing with keep_going=True. Before I set that I was getting a different error, which may also be important:

ValueError: Couldn't find a pure Python 3 wheel for 'pyzmq>=22.3'. You can use `micropip.install(..., keep_going=True)` to get a list of all packages with missing wheels.

@jtpio

jtpio commented Apr 19, 2022

@chrispyles do you know which dependency brings pyzmq?

Maybe explicitly installing click first can help a bit:

import micropip

await micropip.install('click<=8.0.4')
await micropip.install('https://raw.githubusercontent.com/chrispyles/example-workshop-2/main/otter_grader-3.2.1-py3-none-any.whl', keep_going=True)

Getting the following when trying on https://jupyterlite.github.io/demo/lab/index.html:

[Screenshot of the resulting output; image not preserved]

@chrispyles
Member

@jtpio I don't know which dependency requires pyzmq. Do you know of an easy way to find that out without hunting through the requirements files for each dependency?

@joelostblom
Contributor

joelostblom commented Apr 22, 2022 via email

@chrispyles
Member

According to pip-tree, it looks like jupyter-client is what requires pyzmq. I've moved that into requirements-test.txt so it's not being marked as part of Otter's install_requires and I was successfully able to install the wheel linked above after doing that.
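
For anyone who wants to repeat this kind of lookup without an extra tool, here is a rough standard-library sketch that asks which installed distributions declare a given requirement. It is not what pip-tree does internally: `requirement_name` and `who_requires` are hypothetical helpers, the name parsing is deliberately simple, and requirements that only apply to an optional extra are counted too.

```python
import re
from importlib.metadata import distributions


def normalize(name):
    """Normalize a project name so 'Jupyter_Client' matches 'jupyter-client'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else None


def who_requires(target):
    """List installed distributions whose metadata names `target` as a requirement."""
    target = normalize(target)
    found = set()
    for dist in distributions():
        for requirement in dist.requires or []:
            if requirement_name(requirement) == target:
                name = dist.metadata.get("Name")
                if name:
                    found.add(name)
                break
    return sorted(found)
```

In a regular (non-Lite) environment with Otter's dependencies installed, `who_requires("pyzmq")` would be the call to try; its result depends on what is installed.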

However, when I tried to import otter, I'm getting an error being caused by importing dill:

---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
Input In [2], in <cell line: 1>()
----> 1 import otter

File /lib/python3.10/site-packages/otter/__init__.py:5, in <module>
      1 """Otter's Python API"""
      3 import platform
----> 5 from . import api
      6 from .check import logs
      7 from .check.notebook import Notebook

File /lib/python3.10/site-packages/otter/api.py:18, in <module>
     15     from .utils import nullcontext  # nullcontext is new in Python 3.7
     17 from .export import export_notebook
---> 18 from .run import main as run_grader
     21 def grade_submission(submission_path, ag_path="autograder.zip", quiet=False, debug=False):
     22     """
     23     Runs non-containerized grading on a single submission at ``submission_path`` using the autograder 
     24     configuration file at ``ag_path``. 
   (...)
     43             submission.
     44     """

File /lib/python3.10/site-packages/otter/run/__init__.py:5, in <module>
      3 import json
      4 import os
----> 5 import dill
      6 import shutil
      7 import tempfile

File /lib/python3.10/site-packages/dill/__init__.py:25, in <module>
     19 __doc__ = """
     20 """ + __doc__
     22 __license__ = """
     23 """ + __license__
---> 25 from ._dill import dump, dumps, load, loads, dump_session, load_session, \
     26     Pickler, Unpickler, register, copy, pickle, pickles, check, \
     27     HIGHEST_PROTOCOL, DEFAULT_PROTOCOL, PicklingError, UnpicklingError, \
     28     HANDLE_FMODE, CONTENTS_FMODE, FILE_FMODE
     29 from . import source, temp, detect
     31 # get global settings

File /lib/python3.10/site-packages/dill/_dill.py:211, in <module>
    209 FileType = get_file_type('rb', buffering=0)
    210 TextWrapperType = get_file_type('r', buffering=-1)
--> 211 BufferedRandomType = get_file_type('r+b', buffering=-1)
    212 BufferedReaderType = get_file_type('rb', buffering=-1)
    213 BufferedWriterType = get_file_type('wb', buffering=-1)

File /lib/python3.10/site-packages/dill/_dill.py:204, in get_file_type(*args, **kwargs)
    202 def get_file_type(*args, **kwargs):
    203     open = kwargs.pop("open", __builtin__.open)
--> 204     f = open(os.devnull, *args, **kwargs)
    205     t = type(f)
    206     f.close()

UnsupportedOperation: File or stream is not seekable.

dill is used by Otter's logging and checking mechanisms, so I'm not sure if it's possible to remove this dependency without losing critical functionality. Any ideas on how to resolve this?

@joelostblom
Contributor

This might be because JupyterLite handles file I/O differently from JupyterLab, see e.g. jupyterlite/jupyterlite#119 and the issues linked there. It is possible that disabling logging or trying out this extension https://github.com/jupyterlab-contrib/jupyterlab-filesystem-access could help, but I am not sure about this so let's see if there are other suggestions.

@chrispyles
Member

@jtpio ok I've managed to get otter to be importable by disabling logging but the next task is a bit more difficult. I'm working on adding a way to download test files on demand but I'm getting an SSLError when trying to use the requests library. I thought about using the fetch function but I think that wouldn't work since the code calling it is synchronous and hence can't use await. What are your thoughts on this?

---------------------------------------------------------------------------
SSLError                                  Traceback (most recent call last)
File /lib/python3.10/site-packages/urllib3/connectionpool.py:692, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    691 timeout_obj = self._get_timeout(timeout)
--> 692 conn = self._get_conn(timeout=pool_timeout)
    694 conn.timeout = timeout_obj.connect_timeout

File /lib/python3.10/site-packages/urllib3/connectionpool.py:281, in HTTPConnectionPool._get_conn(self, timeout)
    279         conn = None
--> 281 return conn or self._new_conn()

File /lib/python3.10/site-packages/urllib3/connectionpool.py:1009, in HTTPSConnectionPool._new_conn(self)
   1008 if not self.ConnectionCls or self.ConnectionCls is DummyConnection:
-> 1009     raise SSLError(
   1010         "Can't connect to HTTPS URL because the SSL module is not available."
   1011     )
   1013 actual_host = self.host

SSLError: Can't connect to HTTPS URL because the SSL module is not available.

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
File /lib/python3.10/site-packages/requests/adapters.py:440, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    439 if not chunked:
--> 440     resp = conn.urlopen(
    441         method=request.method,
    442         url=url,
    443         body=request.body,
    444         headers=request.headers,
    445         redirect=False,
    446         assert_same_host=False,
    447         preload_content=False,
    448         decode_content=False,
    449         retries=self.max_retries,
    450         timeout=timeout
    451     )
    453 # Send the request.
    454 else:

File /lib/python3.10/site-packages/urllib3/connectionpool.py:785, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    783     e = ProtocolError("Connection aborted.", e)
--> 785 retries = retries.increment(
    786     method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    787 )
    788 retries.sleep()

File /lib/python3.10/site-packages/urllib3/util/retry.py:592, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
    591 if new_retry.is_exhausted():
--> 592     raise MaxRetryError(_pool, url, error or ResponseError(cause))
    594 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)

MaxRetryError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /chrispyles/example-workshop-2/main/tests/q1.py (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available."))

During handling of the above exception, another exception occurred:

SSLError                                  Traceback (most recent call last)
Input In [8], in <cell line: 1>()
----> 1 grader.check("q1")

File /lib/python3.10/site-packages/otter/check/utils.py:174, in logs_event.<locals>.event_logger(wrapped, self, args, kwargs)
    172 except Exception as e:
    173     self._log_event(event_type, success=False, error=e)
--> 174     raise e
    176 else:
    177     self._log_event(event_type, results=results, question=question, shelve_env=shelve_env)

File /lib/python3.10/site-packages/otter/check/utils.py:165, in logs_event.<locals>.event_logger(wrapped, self, args, kwargs)
    163 try:
    164     if event_type == EventType.CHECK:
--> 165         question, results, shelve_env = wrapped(*args, **kwargs)
    167     else:
    168         results = wrapped(*args, **kwargs)

File /lib/python3.10/site-packages/otter/check/notebook.py:202, in Notebook.check(self, question, global_env)
    189 """
    190 Runs tests for a specific question against a global environment. If no global environment
    191 is provided, the test is run against the calling frame's environment.
   (...)
    199     ``otter.test_files.abstract_test.TestFile``: the grade for the question
    200 """
    201 self._logger.info(f"Running check for question: {question}")
--> 202 test_path, test_name = resolve_test_info(
    203     self._path,
    204     self._resolve_nb_path(None, fail_silently=True),
    205     self._tests_url_prefix,
    206     question,
    207 )
    209 self._logger.debug(f"Resolved test path: {test_path}")
    210 self._logger.debug(f"Resolved test name: {test_name}")

File /lib/python3.10/site-packages/otter/check/utils.py:233, in resolve_test_info(tests_dir, nb_path, tests_url_prefix, question)
    231 if tests_url_prefix is not None:
    232     test_url = f"{tests_url_prefix}{'/' if not tests_url_prefix.endswith('/') else ''}{question}.py"
--> 233     res = requests.get(test_url)
    234     if not res.status == 200:
    235         raise ValueError(f"Unable to download test at {test_url}")

File /lib/python3.10/site-packages/requests/api.py:75, in get(url, params, **kwargs)
     64 def get(url, params=None, **kwargs):
     65     r"""Sends a GET request.
     66 
     67     :param url: URL for the new :class:`Request` object.
   (...)
     72     :rtype: requests.Response
     73     """
---> 75     return request('get', url, params=params, **kwargs)

File /lib/python3.10/site-packages/requests/api.py:61, in request(method, url, **kwargs)
     57 # By using the 'with' statement we are sure the session is closed, thus we
     58 # avoid leaving sockets open which can trigger a ResourceWarning in some
     59 # cases, and look like a memory leak in others.
     60 with sessions.Session() as session:
---> 61     return session.request(method=method, url=url, **kwargs)

File /lib/python3.10/site-packages/requests/sessions.py:529, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    524 send_kwargs = {
    525     'timeout': timeout,
    526     'allow_redirects': allow_redirects,
    527 }
    528 send_kwargs.update(settings)
--> 529 resp = self.send(prep, **send_kwargs)
    531 return resp

File /lib/python3.10/site-packages/requests/sessions.py:645, in Session.send(self, request, **kwargs)
    642 start = preferred_clock()
    644 # Send the request
--> 645 r = adapter.send(request, **kwargs)
    647 # Total elapsed time of the request (approximately)
    648 elapsed = preferred_clock() - start

File /lib/python3.10/site-packages/requests/adapters.py:517, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    513         raise ProxyError(e, request=request)
    515     if isinstance(e.reason, _SSLError):
    516         # This branch is for urllib3 v1.22 and later.
--> 517         raise SSLError(e, request=request)
    519     raise ConnectionError(e, request=request)
    521 except ClosedPoolError as e:

SSLError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /chrispyles/example-workshop-2/main/tests/q1.py (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available."))

@jtpio

jtpio commented Apr 26, 2022

when trying to use the requests library. I thought about using the fetch function but I think that wouldn't work since the code calling it is synchronous and hence can't use await. What are your thoughts on this?

Ah unfortunately requests won't work (at least for now), and the recommended way is to indeed use fetch, or pyfetch.

Mind pointing at the code doing the request, to see if there is something to do about it? Thanks!

Also @chrispyles, are you iterating in a fork of otter to make it work in JupyterLite? Curious to know how you are planning to have the same package work in both the regular and lite environments, or whether you were planning to release a separate package for lite?
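
For reference, the `pyfetch` route mentioned above could look roughly like the sketch below. It assumes the code runs inside Pyodide, where `pyodide.http.pyfetch` is available, and that the caller is able to `await`; `fetch_text` is a hypothetical helper name.

```python
async def fetch_text(url):
    """Download `url` and return its body as text (Pyodide only)."""
    # Imported inside the function so this module still imports outside Pyodide.
    from pyodide.http import pyfetch

    response = await pyfetch(url)
    if not response.ok:
        raise ValueError(f"Unable to download {url}: HTTP {response.status}")
    return await response.string()
```

In a JupyterLite cell this would be called as `text = await fetch_text(url)`. Because it is a coroutine, it does not by itself address the synchronous call site raised earlier in the thread.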

@chrispyles
Member

OK, I'll take a look at those. Here's the code making the request.

My plan is to make the same package work in both environments. We've already got some features that are disabled/different for Colab, so I'm bootstrapping the system we use for telling whether a user is on Colab (by checking the interpreter returned by get_ipython) to determine whether the user is on JupyterLite. The code for this is here.
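
A minimal sketch of this style of environment detection follows. The marker substrings are assumptions for illustration rather than values confirmed in this thread, and `classify_shell`, `current_environment`, and `running_on_pyodide` are hypothetical names, not Otter's API.

```python
import sys

# Assumed marker substrings, matched against str(get_ipython()).
SHELL_MARKERS = {
    "colab": "google.colab",
    "jupyterlite": "pyolite",
}


def classify_shell(shell_repr):
    """Map the text form of an IPython shell object to an environment label."""
    for label, marker in SHELL_MARKERS.items():
        if marker in shell_repr:
            return label
    return "other"


def current_environment():
    """Classify the running interpreter, or return 'other' outside IPython."""
    try:
        shell = get_ipython()  # noqa: F821 (only defined by IPython-based kernels)
    except NameError:
        return "other"
    return classify_shell(str(shell))


def running_on_pyodide():
    """Independent check: Pyodide builds report the Emscripten platform."""
    return sys.platform == "emscripten"
```

Keeping the string matching in `classify_shell` separate from the `get_ipython` lookup makes the detection logic easy to exercise outside a notebook.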

@chrispyles
Member

Got it working with pyodide.open_url
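
For context on why this helps: `open_url` is a blocking call that returns a file-like text object, so it fits a synchronous call site where `await` is not available. Below is a hedged sketch of a helper that uses it under Pyodide and falls back to `urllib` elsewhere. `read_text` and `default_opener` are hypothetical names, and the `pyodide.http.open_url` import path is an assumption about newer Pyodide releases (the thread uses `pyodide.open_url`).

```python
import io
import sys


def default_opener():
    """Pick a synchronous opener that returns a file-like text object."""
    if sys.platform == "emscripten":
        try:
            from pyodide.http import open_url  # assumed location in newer Pyodide
        except ImportError:
            from pyodide import open_url  # location used in this thread
        return open_url

    from urllib.request import urlopen

    def open_text(url):
        # Read the whole body eagerly and wrap it to mimic open_url's result.
        with urlopen(url) as response:
            return io.StringIO(response.read().decode("utf-8"))

    return open_text


def read_text(url, opener=None):
    """Fetch `url` synchronously and return its contents as a string."""
    if opener is None:
        opener = default_opener()
    return opener(url).read()
```

The `opener` parameter exists so the helper can be tried without network access, for example `read_text("any-url", opener=lambda url: io.StringIO("test body"))`.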

@chrispyles
Member

@jtpio is there a way to access the notebook file with something like open? For context, Otter supports storing tests in the notebook metadata so what I'd like to do is be able to open the notebook file so that I can read the test data from the notebook metadata. Currently just using open("example.ipynb") doesn't work.

@chrispyles
Member

OK, so I've just merged #489, which adds the fixes for JupyterLite and some new features related to it. I'm going to leave this issue open for now, though, since I still need to add these changes to the documentation.

The new changes will be included in the 1st beta of v4 which should be out before the workshop in June.

@chrispyles chrispyles added the documentation Improvements or additions to documentation label May 29, 2022
@jtpio

jtpio commented May 30, 2022

wow this sounds great @chrispyles!

is there a way to access the notebook file with something like open?

Not yet. Accessing content from a Python notebook is still a bit clunky for now, but should hopefully be improved at some point (tracked in jupyterlite/jupyterlite#315).

@jtpio

jtpio commented Jun 17, 2022

@chrispyles
Member

Thanks for the update @jtpio. I've just merged in the docs updates + other related updates from #496 so we have an initial implementation of jupyterlite support ready for the 1st beta of otter v4. I'm going to leave this open though so that we can use it to track an update of the initial implementation to make use of the filesystem access you added.

@chrispyles chrispyles removed the documentation Improvements or additions to documentation label Jun 24, 2022
@jtpio

jtpio commented Jun 24, 2022

OK sounds good!

@chrispyles
Member

Since the initial implementation is done, I'm closing this issue in favor of #511 (which will be used to track the tech debt reduction I discussed above) so that this issue can stay in the v4.0.0 milestone.

5 participants