Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial ppc64le Support #1057

Merged
merged 3 commits into from
Oct 1, 2023
Merged

Conversation

erichte-ibm
Copy link
Contributor

@erichte-ibm erichte-ibm commented Sep 18, 2020

Revive PR #819 (by q66) in reduced-scope by implementing powerpc64le support in ring. This effort supports only little endian machines using POWER8 and later processors for now. The goal is to enable ring to target modern powerpc64le machines.

Only the bare minimum needed to build ring and pass ring tests is included; there is no new optimized code for algorithms already implemented in C/Rust. This has been tested on a POWER9 Blackbird, and on a POWER8 machine.

@erichte-ibm erichte-ibm force-pushed the initial-ppc64le-support-pr branch from 05491ff to b1b0141 Compare January 22, 2021 20:22
@codecov
Copy link

codecov bot commented Jan 22, 2021

Codecov Report

Merging #1057 (5c496c5) into main (4b87b67) will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #1057      +/-   ##
==========================================
- Coverage   96.02%   96.01%   -0.01%     
==========================================
  Files         133      133              
  Lines       19913    19913              
  Branches      199      199              
==========================================
- Hits        19121    19120       -1     
- Misses        754      756       +2     
+ Partials       38       37       -1     
Files Coverage Δ
src/cpu/arm.rs 92.39% <ø> (ø)

... and 2 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@erichte-ibm erichte-ibm force-pushed the initial-ppc64le-support-pr branch 2 times, most recently from 60f7563 to 443ef67 Compare January 29, 2021 20:37
@erichte-ibm
Copy link
Contributor Author

Updated the branch on top of current main, fixed some issues reported by CI.

algitbot pushed a commit to alpinelinux/aports that referenced this pull request Mar 8, 2021
still won't build (ring) for ppc64le, current efforts at
briansmith/ring#1057
@erichte-ibm erichte-ibm force-pushed the initial-ppc64le-support-pr branch from 443ef67 to ba5b4b6 Compare April 2, 2021 19:45
@erichte-ibm
Copy link
Contributor Author

Updated the branch on top of current main. Removed algorithm-specific assembly to match commit 501fc4e , only the minimum required BN assembly from OpenSSL is included.

@erichte-ibm erichte-ibm force-pushed the initial-ppc64le-support-pr branch from ba5b4b6 to 226f6f8 Compare February 4, 2022 20:57
@erichte-ibm
Copy link
Contributor Author

Rebased branch on top of current main.

Please let us know if there is anything you need from us to aid you in moving this forward.

@erichte-ibm erichte-ibm force-pushed the initial-ppc64le-support-pr branch from 226f6f8 to f0ba073 Compare March 11, 2022 20:47
@ghost ghost mentioned this pull request Apr 29, 2022
@ghost
Copy link

ghost commented Apr 29, 2022

Did you ever hear from @briansmith?

@runlevel5
Copy link

@erichte-ibm @briansmith is there anything I could help to get this PR wrap up?

@erichte-ibm erichte-ibm force-pushed the initial-ppc64le-support-pr branch from f0ba073 to 0857717 Compare May 6, 2022 19:35
@erichte-ibm
Copy link
Contributor Author

Did you ever hear from @briansmith?

We are still waiting to hear back, though we have be actively rebasing every week or so to try to keep this PR up to date.

@erichte-ibm @briansmith is there anything I could help to get this PR wrap up?

Reviews and testing would be wonderful! So far, we have tested on little endian P8/P9/P10. We'd love to know your use case and platform so that we can work together on any issues you might find.

@ghost
Copy link

ghost commented May 6, 2022

We'd love to know your use case

I've added support for powerpc64le to nixpkgs.

Nixpkgs is sort of like Gentoo except that it isn't a distribution, and technically isn't even Linux-specific.
  • It can run without root or needing to take over the whole machine, on top of any other Linux distribution, many of the BSDs, and even (with a lesser degree of functionality) MacOS.
  • It natively supports installing multiple different versions of a package or library, with zero additional effort or complexity.
  • It uses a very simple and elegant lazy functional language instead of Python.
  • It keeps all generated files (compiled binaries, etc) in an immutable-once-written, garbage collected "heap in the filesystem" (/nix/store).

Gentoo can do the first two things, sort of, sometimes, with Gentoo Prefixes, but those only work for certain packages and are still sort of an experimental feature.

Credit where credit is due, about half of the necessary effort was already in-tree as unfinished work; I did the other half to push it across the finish line. It successfully bootstraps now (if you apply all of my PRs, a few are not yet merged), building a complete Linux userspace from scratch. Nix does an outstanding job of bringing out the parallelism in this build process, which really shows off the throughput on my high-core-count Talos2 machines (those slick eDRAM cells you use don't hurt either; too bad GF never offered those to the rest of us).

Unfortunately Rust's recursive hashing scheme (Cargo.lock) makes it really awkward to carry out-of-tree patches to a widely-depended-upon library like Ring.

You need to use the cargo [patch] stanza, but unfortunately this requires recalculating the entire Cargo.lock during builds, which requires network access to https://crates.io. Network access during builds is a total non-starter for us. Recalculating the Cargo.lock of every package in nixpkgs that uses ring is theoretically possible, but unlikely to be acceptable... if that sort of thing were allowed it would quickly snowball into an exponential number of package updates for each small change to widely-used libraries.

I would really like to see this merged, even if it's hidden behind an off-by-default "yes I am crazy and I know what I'm doing" feature flag. We would expose it via an off-by-default flag with a similarly-scary name to our own users.

@erichte-ibm
Copy link
Contributor Author

You need to use the cargo [patch] stanza, but unfortunately this requires recalculating the entire Cargo.lock during builds, which requires network access to https://crates.io.

Does adding a feature still require a [patch] to enable that feature for ring somewhere in the downstream dependency chain? (e.g. projects that depend on something like rustls and not ring directly)

I would really like to see this merged, even if it's hidden behind an off-by-default "yes I am crazy and I know what I'm doing" feature flag. We would expose it via an off-by-default flag with a similarly-scary name to our own users.

A feature would be somewhat redundant since we check the target_arch in most places, however it would make it more explicit that we wish to use a "less supported" platform.

@briansmith Would adding an explicit feature for POWER support help get this reviewed/merged?

@ghost
Copy link

ghost commented May 14, 2022

You need to use the cargo [patch] stanza, but unfortunately this requires recalculating the entire Cargo.lock during builds, which requires network access to https://crates.io.

Does adding a feature still require a [patch] to enable that feature for ring somewhere in the downstream dependency chain? (e.g. projects that depend on something like rustls and not ring directly)

I don't think features are covered by the Cargo.lock checksum field, but even if they are, a [patch] stanza will not be required, because feature flags are "unioned up the dependency tree". Cargo will only build one copy of ring per binary, even if different parts of the binary express dependencies upon ring with different feature sets. Cargo will union all of those feature sets and build ring only once.

This means that it is enough to add an extra (possibly redundant) ring dependency at the top level with the scarily-named feature flag enabled. That is enough for every package throughout the entire dependency tree of that particular binary to be built with the feature enabled. The Cargo.lock file contains an entry for the top-level package, but that entry is special: it has no checksum field. This makes sense, because Cargo.lock is distributed alongside the top-level package's source code. Including a checksum would be redundant. So we can add feature flags there without breaking any checksums.

I've had an extraordinarily hard time finding a formal specification of that file format (precisely how the checksum is computed, an exhaustive list of the fields, grammar for the source quasi-urls, etc...). Do you know of one?

@erichte-ibm
Copy link
Contributor Author

Cargo will union all of those feature sets and build ring only once.

Good to know!

This means that it is enough to add an extra (possibly redundant) ring dependency at the top level with the scarily-named feature flag enabled

With that knowledge, this makes sense to us at least. Hopefully we can hear back from the maintainer at some point if they agree this is an acceptable approach.

@bkeys
Copy link

bkeys commented Aug 13, 2022

I would like to have a rust desk build for ppc64le; it would be appreciated if someone reviewed this pull request if there is nothing wrong with it.

@ghost ghost mentioned this pull request Sep 12, 2022
@rjzak
Copy link

rjzak commented Sep 19, 2022

I'm thankful for this work, it's enabled me to listen to Spotify with Psst on my Talos II. So far, so good! I've tested it with some other projects as well.

@jaesharp
Copy link

jaesharp commented Oct 13, 2022

Confirmed that this project builds with hid-io/hid-io-core@5394d79 using cargo patch stanza pointed to IBM/ring@3f8f04a (IBM/[email protected]), but required reverting f06811a due to API usage in rcgen 0.9. Additionally, PR branch proposed builds with multiple other rust packages depending on ring using latest ring HEAD. Thank you to @erichte-ibm and the rest of the maintainers of the ppc64le ring port for unblocking so many packages with rcgen, webpki, and rustls dependencies!

@jaesharp
Copy link

Note for other users requiring ppc64le support: IBM/ring also implements a patchset for ring-0.16.20 ( https://github.com/IBM/ring/tree/ppc-0.16.20 ) and generally tracks main on ( https://github.com/IBM/ring/tree/main-ppc ) but the branch with which this PR is concerned ( https://github.com/IBM/ring/tree/initial-ppc64le-support-pr ) is kept fully up to date and thus also tracks IBM/ring@main-ppc .

@runlevel5
Copy link

Wondering if the core team could help review this PR. It would mean a world for the POWER community

@runlevel5
Copy link

@erichte-ibm what's the progress of this PR please? AFAIK there is also another branch https://github.com/IBM/ring/commits/ppc-0.16.20 downstream.

@erichte-ibm
Copy link
Contributor Author

To be honest, I don't know. Last I checked, ppc64le support still relies on some perlasm to work in ring, albeit less than the original pull request. If that is a blocker for initial support, then we can't make much progress here. There have been some changes though since the last time I really looked at this PR though, so when I get the chance in the next few days, I'll dive into refreshing this PR branch.

The other note was CI -- I've coincidentally be doing work on testing ppc64le code in Actions via docker/qemu, so if running tests / getting coverage is an issue as well, I can also look into adapting that here as well.

As for the IBM/ring ppc-0.16.20 branch, that exists for other projects to patch support so that they can compile on ppc64le. Right now, it's actually a cobbled mess of newer commits that causes some build issues. I will be pushing a revised version of the branch soon that is based directly on 0.16.20 that should fix it.

@briansmith
Copy link
Owner

Could you please rebase on top of main and then resubmit this so that the CI run will kick in?

I think that we need to adapt the CI changes from #1297 to also test this target in CI, especially because I believe this will be the first 64-bit little-endian target that uses some of the fallback code, so we need to make sure it is tested in CI. Originally we had some MIPS CI coverage for this code, but I had to remove it because that MIPS target got downgraded by the Rust project and broke our build.

Here is how I am going to test this PR locally:

mk/install-build-tools.sh --target=WHATEVER
mk/cargo.sh test --target=WHATEVER

Roughly you need to copy and adapt this logic in mk/install-build-tools.sh:

--target=aarch64-unknown-linux-musl|--target=armv7-unknown-linux-musleabihf)
  use_clang=1
  install_packages \
    qemu-user
  ;;

And this logic in mk/cargo.sh:

  aarch64-unknown-linux-gnu)
    export CC_aarch64_unknown_linux_gnu=clang-$llvm_version
    export AR_aarch64_unknown_linux_gnu=llvm-ar-$llvm_version
    export CFLAGS_aarch64_unknown_linux_gnu="--sysroot=/usr/aarch64-linux-gnu"
    export CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc
    export CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_RUNNER="$qemu_aarch64"
    ;;

Then in .github/ci.yml you need to add a line to the target matrix for this target. See #1297.

(Please use clang and not GCC for this CI integration.)

@briansmith
Copy link
Owner

I am intending to do a release soon, probably this week. If this PR is ready to merge before the release, I will include it. Otherwise, I am happy to do a 0.17.1 release with this.

@briansmith
Copy link
Owner

To be honest, I don't know. Last I checked, ppc64le support still relies on some perlasm to work in ring, albeit less than the original pull request. If that is a blocker for initial support, then we can't make much progress here. There have been some changes though since the last time I really looked at this PR though, so when I get the chance in the next few days, I'll dive into refreshing this PR branch.

Since we got MIPS working at one point in CI, I believe that PowerPC should work now as well.

@erichte-ibm erichte-ibm force-pushed the initial-ppc64le-support-pr branch from 69226a7 to 5cfac6a Compare September 26, 2023 21:08
@erichte-ibm
Copy link
Contributor Author

Thanks for the response! I've pushed a rebased branch with the CI support added, though I have a couple comments/questions:

  1. There seems to be an issue in the fallback code when compiling without the "alloc" feature -- bn_mul_mont_fallback.rs is not linked in, as the bigint mod is not included without "alloc", however the C files gfp_p256.c and gfp_p384.c are still compiled and expect the prefixed symbol exported. Do you have a preferred method for resolving this?
  2. Like the s390x PR, this still uses the gcc linker -- for some reason I am getting linking errors with the LLVM linker, I'm going to keep working on that. It's possible it's either my test environment, or I am calling the wrong thing.

@briansmith
Copy link
Owner

There seems to be an issue in the fallback code when compiling without the "alloc" feature -- bn_mul_mont_fallback.rs is not linked in, as the bigint mod is not included without "alloc", however the C files gfp_p256.c and gfp_p384.c are still compiled and expect the prefixed symbol exported. Do you have a preferred method for resolving this?

I will post a PR to fix this today.

Like the s390x PR, this still uses the gcc linker -- for some reason I am getting linking errors with the LLVM linker, I'm going to keep working on that. It's possible it's either my test environment, or I am calling the wrong thing.

Don't worry about it for now.

@briansmith
Copy link
Owner

Please rebase on main and resubmit. Thanks!

@erichte-ibm
Copy link
Contributor Author

Please rebase on main and resubmit. Thanks!

Done!

@runlevel5
Copy link

runlevel5 commented Sep 27, 2023

It's seems to me the coverage ci build failed due to missing token

ref: https://github.com/briansmith/ring/actions/runs/6321304636/job/17166234616?pr=1057#step:7:121

I think it is unrelated to changes of this PR

@briansmith wondering if you could manually retry that failed build. Many thanks

@briansmith
Copy link
Owner

briansmith commented Sep 29, 2023

Good news:
The refactoring to fix the --no-default-features build was merged into main.

Less good news:
There's a merge conflict in crypto/internal.h from the latest BoringSSL merge. Also, PR #1663 moved the stuff in base.h to target.h. I will merge PR #1663 ASAP so you can rebase this on top of main for the release. I am hoping to finish the BoringSSL merge today so there shouldn't be any new conflicts.

@erichte-ibm erichte-ibm force-pushed the initial-ppc64le-support-pr branch from 0362bf3 to 0e36904 Compare September 30, 2023 14:26
@erichte-ibm
Copy link
Contributor Author

Rebased on top of new main, moved definitions to target.h.

@briansmith
Copy link
Owner

Sorry, this is a nightmare for merge conflicts because the merges I have are riscv (done), ppc64le, and s390x, which alphabetically are adjacent, so every change conflicts.

Please rebase one more time and I will do my best to merge this before I merge another conflict. Then I will merge s390x after that is rebased.

Sorry for the inconveniences.

Other architectures may not need to do feature checks, and therefore
Feature::available, Feature::mask, etc are never used/read.

This snippet mirrors the similar bit at the top of cpu/intel.

Signed-off-by: Eric Richter <[email protected]>
@erichte-ibm erichte-ibm force-pushed the initial-ppc64le-support-pr branch from 0e36904 to 5c496c5 Compare September 30, 2023 23:08
@erichte-ibm
Copy link
Contributor Author

No worries! Rebased, and fixed the conflicts, preserving the alphabetic order of the targets.

@briansmith briansmith merged commit 3161c01 into briansmith:main Oct 1, 2023
150 of 151 checks passed
@briansmith
Copy link
Owner

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants