This repository has been archived by the owner on Aug 1, 2024. It is now read-only.

Add basic support for Apple Silicon #852

bradenmacdonald commented Nov 3, 2021

This PR changes the Makefile to automatically detect if the developer is using an ARM system like Apple's M1 notebooks. It will then adjust the docker-compose file so that devstack can run.

Without this PR, attempting to run devstack on an ARM Mac fails with the error "no matching manifest for linux/arm64/v8 in the manifest list entries".

Does it work?

Yes, but slowly.

What I have confirmed works:

  • Provisioning the default set of services
  • Basic usage of the LMS. It seems to work perfectly fine, just quite a bit more slowly than I'm used to.

I am still in the process of testing this out and setting up my devstack so I will update this with more information.

How it works

  • Any services whose docker image already includes ARM support are unchanged (e.g. the MFE node containers, mongo, memcached)
  • Any services whose current docker image doesn't support ARM, but for which a compatible image with ARM support exists, are changed to use the compatible image (e.g. mysql:5.7 → mariadb:10.4, redis:2.8 → redis:5)
  • The rest of the containers/services use custom images published by edxops that currently don't support ARM. These ones are just configured to run as x86_64 (amd64) images with qemu emulation. This works fine but is slow.
    • As/if more developers get Apple Silicon, we should find a way to build multi-arch images for these, which would solve the performance problem.
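
For illustration, the detection step described above could be sketched in shell roughly as follows. This is not the PR's code: the PR does the check in the Makefile with `uname -p`, whereas this sketch swaps in `uname -m`, which reports `arm64` on Apple Silicon macOS and `aarch64` on ARM Linux. The helper name `compose_files_for` and the base file name `docker-compose.yml` are assumptions made for the example; `docker-compose-arm64v8.yml` is the override file this PR adds.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): map a machine name, as printed
# by `uname -m`, to a colon-separated COMPOSE_FILE value for docker-compose.
compose_files_for() {
    machine="$1"
    # Assumption: a single base file; the real devstack lists several files.
    base="docker-compose.yml"
    case "$machine" in
        arm64|aarch64)
            # ARM host: layer the ARM override on top of the base file.
            printf '%s\n' "$base:docker-compose-arm64v8.yml"
            ;;
        *)
            # Any other host: use the base file unchanged.
            printf '%s\n' "$base"
            ;;
    esac
}

COMPOSE_FILE="$(compose_files_for "$(uname -m)")"
export COMPOSE_FILE
echo "COMPOSE_FILE=$COMPOSE_FILE"
```

Because docker-compose merges the files listed in COMPOSE_FILE from left to right, the override only needs to name the services whose image or platform differs on ARM.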

How can I test this?

You need to use either an M1 Mac, an ARM Linux system like a Raspberry Pi, or an ARM virtual machine (e.g. use UTM to run arm64 Linux on an Intel Mac).

@openedx-webhooks

Thanks for the pull request, @bradenmacdonald! I've created OSPR-6205 to keep track of it in JIRA, where we prioritize reviews. Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

  • supporting documentation
  • Open edX discussion forum threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

@openedx-webhooks added the "needs triage" and "open-source-contribution" (PR author is not from Axim or 2U) labels on Nov 3, 2021
@regisb
Contributor

regisb commented Nov 4, 2021

As mentioned on the Open edX forum, we are also affected by this issue in Tutor: overhangio/tutor#510

I'm curious to learn what solution will be found for the devstack. In particular, switching to MariaDB seems like a pretty radical choice, given that incompatibilities exist:

https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/
https://mariadb.com/kb/en/incompatibilities-and-feature-differences-between-mariadb-103-and-mysql-57/

(full disclaimer: I'm terrible at understanding MySQL, but I find this list of incompatibilities long and scary)

@natabene

natabene commented Nov 4, 2021

@bradenmacdonald Thank you for your contribution. Is this ready for our review?

@openedx-webhooks added the "waiting on author" (PR author needs to resolve review requests, answer questions, fix tests, etc.) label and removed the "needs triage" label on Nov 4, 2021
@bradenmacdonald
Author

> In particular, switching to Mariadb seems like a pretty radical choice, given that incompatibilities exist

Just to be clear, this is only for development, and only for people with ARM systems, and I consider it a temporary solution even for that. I'll certainly update this if I encounter any issues.

@bradenmacdonald
Author

@natabene Yes, it's ready for review. I expect there may be some discussion about it, since this may or may not be the best direction to go in. But as far as the code goes, I think this is complete.

@openedx-webhooks added the "awaiting prioritization" label and removed the "waiting on author" label on Nov 9, 2021
@bradenmacdonald
Author

Today I tried building ARM64 images of edxapp but ran into an issue when first trying to build the focal-common base image for ARM64.

cd ..
git clone https://github.com/edx/configuration.git
cd configuration
docker build --platform linux/arm64/v8 -f docker/build/focal-common/Dockerfile --tag edxops/focal-common:latest .

Somehow the task "Download digicert intermediate Certificate" is failing with a segfault.

<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python /root/.ansible/tmp/ansible-tmp-1636564730.5833266-53209-155692291711061/AnsiballZ_get_url.py && sleep 0'
fatal: [127.0.0.1]: FAILED! => {
    "changed": false,
    "module_stderr": "Segmentation fault\n",
    "module_stdout": "",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 139
}

Some debugging suggests that the error occurs in fetch_url in https://github.com/ansible/ansible/blob/stable-2.8/lib/ansible/module_utils/urls.py


  mysql57:
    #image: mysql:5.7 doesn't support linux/arm64/v8
    image: mariadb:10.4
Contributor

I see you are using platform: linux/amd64 in other places in this PR. Have you tried doing that with mysql57? If so, how did it go?

Author

I didn't try it, so I'm not sure. Based on my experience with the other images, I suspect it would work fine, but all the database queries would be a bit slower, degrading overall performance.

Copy link

For what it's worth, the MySQL 8 images published by Oracle support arm64/v8: https://hub.docker.com/r/mysql/mysql-server/tags

Contributor

@jinder1s jinder1s left a comment

Hi, thank you for creating this PR. At first glance, it mostly looks good. I added a few questions in the review.

Also, would you either modify the README with a note about this or write an ADR about this addition?

Comment on lines +39 to +41
  redis:
    # image: redis:2.8 is ancient and doesn't support linux/arm64/v8
    image: redis:5
Contributor

How much were you able to test this change?

Author

I haven't done extensive testing yet, only basic things like logging into the LMS, using the Django admin, and running parts of the test suite. However, so far I haven't encountered any issues.

Comment on lines +82 to +85
OS_ARCH := $(shell uname -p)
ifeq ($(OS_ARCH),arm)
COMPOSE_FILE := $(COMPOSE_FILE):docker-compose-arm64v8.yml
endif
Contributor

I'm wondering if this should be opt-in. I can't think of why someone with Apple Silicon would not want to do this, but I'm putting this comment here in case someone reading this can think of a reason.

Author

Well, the result is pretty slow. But it works, and it's definitely better than not having a devstack at all on an M1 Mac.

I am dissatisfied with the performance though, which is why I've now tried building a native ARM image for edxapp instead. If I could get that working, I'd definitely change the PR to use that approach instead. I hit a roadblock, which I think is due to the old version of ansible being used.

@bradenmacdonald
Author

> Also, would you either modify the README with a note about this or write an ADR about this addition?

Sure, I just want to spend a bit more time testing this first, and seeing if I can explore an alternative approach.

  xqueue_consumer:
    platform: linux/amd64

  # edX Microfrontends all use node:12 which has linux/arm64/v8 builds.

@msegado msegado Nov 29, 2021

Looks like the MFEs have an issue too; optipng-bin fails to compile on M1 (see imagemin/optipng-bin#117), which then breaks the build since it's called later in the pipeline.

As suggested in the issue above, one can work around this for now by setting CPPFLAGS=-DPNG_ARM_NEON_OPT=0. I can confirm that this works if added as an environment variable to the MFE services.
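
As a rough sketch of that workaround, the flag could be applied only on ARM hosts before installing MFE dependencies. The comment above sets the variable on the MFE services in docker-compose; setting it in a shell session instead, and the helper name `neon_workaround_flags`, are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the extra preprocessor flag needed on a given
# machine (as named by `uname -m`), or nothing if no workaround is needed.
neon_workaround_flags() {
    case "$1" in
        arm64|aarch64) printf '%s\n' "-DPNG_ARM_NEON_OPT=0" ;;
        *) printf '%s\n' "" ;;
    esac
}

flags="$(neon_workaround_flags "$(uname -m)")"
if [ -n "$flags" ]; then
    # Append to any CPPFLAGS already set rather than overwriting them.
    export CPPFLAGS="${CPPFLAGS:+$CPPFLAGS }$flags"
fi
echo "CPPFLAGS=${CPPFLAGS:-}"
# The dependency install (for example `npm install`) would follow here and
# inherit CPPFLAGS from the environment.
```

On non-ARM hosts the script leaves CPPFLAGS untouched, so the same setup step can be shared across machines.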

@jinder1s
Contributor

@bradenmacdonald Thanks again for doing this work. For now, it looks like this is still in-progress. Please ping me when you'd like further review.

@openedx-webhooks added the "waiting on author" label and removed the "engineering review" label on Dec 3, 2021
@bradenmacdonald
Author

I'm going to close this PR, and recommend that anyone who wants to develop on an ARM64 device use Tutor devstack instead. See discussion at overhangio/tutor#510 (comment)

Rationale:

  • The devstack experience under qemu emulation (which this PR provides) is not great - it's very slow.
  • Tutor allows building and running the image natively on arm64, which is fast.
  • Debugging the edX ansible scripts to be able to build an arm64 image is painful. It runs ansible stuff for about an hour and then gives a segfault; inspecting the broken container took a lot of work (messing around with "AnsiballZ" internals), but ultimately didn't help me figure out the issue. Debugging the tutor build process is much easier, it builds way faster, and it generally just works.

@openedx-webhooks

@bradenmacdonald Even though your pull request wasn’t merged, please take a moment to answer a two question survey so we can improve your experience in the future.

@openedx-webhooks added the "rejected" label and removed the "waiting on author" label on Dec 13, 2021
@jmbowman jmbowman mentioned this pull request Apr 13, 2022
@johnnagro
Contributor

@bradenmacdonald We are taking another look into M1 support in devstack. There is preliminary support on this branch/PR, if you're still interested: #920

@bradenmacdonald
Author

@johnnagro Thanks for the heads up. Does that branch run edx-platform in an arm64 container or under x86 emulation? I had been able to get the latter working but not the former.

@johnnagro
Contributor

@bradenmacdonald It uses/creates ARM containers. We are in the early stages of testing.
