-
Notifications
You must be signed in to change notification settings - Fork 711
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge BoringSSL through d605df5b6f8462c1f3005da82d718ec067f46b70 #1659
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This isn't captured by the comments, the ABI tests have technically been going out of bounds, and is entirely unnecessary. It can just take Htable as a parameter. Change-Id: Iad748d5b649333985ebaa1f84031fbe9a2339a85 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59505 Auto-Submit: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]> Commit-Queue: Adam Langley <[email protected]>
…4.pl This is a little trickier because Intel architectures are so inconveniently register-starved. This code was already using every register. However, since Xi is only needed at the start and end of the function, I just swapped the Xi and Htable parameters. Xi is passed on the stack, so we don't need to explicitly spill it. Change-Id: I2ef4552fc181a5350c9b1c733cf2319377a06b74 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59525 Reviewed-by: Adam Langley <[email protected]> Commit-Queue: David Benjamin <[email protected]>
This was originally removed in https://boringssl-review.googlesource.com/12477, but restored in https://boringssl-review.googlesource.com/c/boringssl/+/13122, which also restored a comment the assembly code secretly relies on the struct layout. Our comment references the MOVBE-based assembly, which could mean either the stitched AES+GCM code, or the GHASH-only code. I don't see an obvious place where the GHASH-only code does this. The stitched ones (both x86_64 and aarch64, counter to the comment) did this, but the preceding CLs fix this. I think we can now remove this comment. In OpenSSL, this comment dates to d8d958323bb116bf9f88137ba46948dcb1691a77 in upstream. That commit also removed a field, so we can assume no assembly prior to that change relied on that layout. That commit immediately preceded 1e86318091817459351a19e72f7d12959e9ec570, which was likely the motivation. At the time, gcm_gmult_neon had some code with a comment "point at H in GCM128_CTX". At the time, Htable wasn't even filled in, and the function relied on H being 16 bytes before Htable. gcm_ghash_neon also relies on them being reachable from Xi. This code changed in f8cee9d08181f9e966ef01d3b69ba78b6cb7c8a8 and, at a glance, the file doesn't seem to be making this assumption anymore. I think, prior to that change, Htable wasn't filled in for the NEON code at all. With all this now resolved, remove the comment and unused copy of H in GCM128_KEY. Change-Id: I4eb16cfff5dd41a59a0231a998d5502057336215 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59526 Reviewed-by: Adam Langley <[email protected]> Commit-Queue: David Benjamin <[email protected]>
This can be done with just memcpy, which tempts the compiler slightly less. Bug: 574 Change-Id: I7b46c2f2578abc85621834426de20d5eaf492a74 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59527 Reviewed-by: Adam Langley <[email protected]> Commit-Queue: David Benjamin <[email protected]>
The initial codepoint is called X25519Kyber786Draft00 in the draft, so align with that name for this version. Also remove the placeholder bits for the other combinations, which haven't gotten that far yet. Update-Note: Update references to NID_X25519Kyber768 to NID_X25519Kyber768Draft00. For now, the old name is available as an alias. Change-Id: I2e531947f41e589cec61607944dca844722f0947 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59605 Auto-Submit: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]> Commit-Queue: David Benjamin <[email protected]>
Fixed: 605 Change-Id: Id2b7d0221c3e43c102ce77c800f7ab65c74e0582 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59625 Commit-Queue: Adam Langley <[email protected]> Reviewed-by: Adam Langley <[email protected]> Auto-Submit: David Benjamin <[email protected]>
EC_RAW_POINT is a confusing name. It's mostly about whether this is stack-allocated EC_POINT without the EC_GROUP pointer. Now that we have EC_AFFINE, EC_JACOBIAN captures what it's doing a bit better. Fixed: 326 Change-Id: I5b71a387e899a94c79be8cd5e0b54b8432f7d5da Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59565 Reviewed-by: Adam Langley <[email protected]> Auto-Submit: David Benjamin <[email protected]> Commit-Queue: Adam Langley <[email protected]>
This was a bit of a mess. There are three assembly functions to juggle here. Their current type signatures are: void gcm_init_v8(u128 Htable[16], const uint64_t H[2]); void gcm_gmult_v8(uint64_t Xi[2], const u128 Htable[16]); void gcm_ghash_v8(uint64_t Xi[2], const u128 Htable[16], const uint8_t *inp, size_t len); Except for gcm_nohw.c, this is all assembly, so they don't follow the C abstract machine's theory of typed memory. That means types are mostly arbitrary and we have room to rearrange them. They do carry an implicit alignment requirement, but none of these assembly files care about this[*]. Values passed to gcm_gmult and gcm_ghash get XORed byte-by-byte in places, which is inconvenient to do as uint64_t. They also get passed to AES functions, which want bytes. Thus I think uint8_t[16] is the most natural and convenient type to use. H in gcm_init is interesting. gcm_init already doesn't take a GHASH key in the natural byte representation. The two 8-byte halves are byte-swapped, but the halves are not swapped, so it's not quite a byte reversal. I opted to leave that as uint64_t[2], mostly to capture that something odd is happening here. [*] We only have GHASH assembly for x86, x86_64, armv7, and aarch64. We used to have armv4 GHASH assembly, but that's been removed from gcm_nohw.c. Thus we can assume none of these files care about alignment for plain scalar loads. Alignment does matter for vmovdqa vs vmovdqu, but that requires 16-byte alignment and uint64_t only implies 4- or 8-byte alignment on these architectures. Bug: 574 Change-Id: If7dba9b41ff62204f4cf8fcd54eb4a4c54214c6e Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59528 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
We cap e in RSA for DoS reasons. draft-amjad-cfrg-partially-blind-rsa needs to create RSA keys with very large e. To support this, add an API which disables this check. Also add some missing checks for negative n and negative e. (Already rejected by the parser, just not at this layer.) Change-Id: Ia996bb1b46fc8b73db704f492b3df72b254a15a4 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59645 Auto-Submit: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]> Commit-Queue: Adam Langley <[email protected]>
This generalizes the scheme we previously had with TLS_RSA_WITH_NULL_SHA, in preparation for TLS_RSA_WITH_3DES_EDE_CBC_SHA (to be deprecated in a later CL) and TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 (to regretably be added back in, but in a deprecated state). The story is that deprecated ciphers can always be listed by name, but by default, selectors will not match them when adding ciphers. They will still match them when removing ciphers. This is so instructions like "@strength" or "!RSA" will still sort or disable the deprecated ciphers, rather than accidentally leaving them at the front or enabled. Additionally, a selector can mark itself as including deprecated ciphers. This is specifically for TLS_RSA_WITH_3DES_EDE_CBC_SHA, because many callers reference it with just "3DES". As an added quirk, "RSA+3DES" will also match it. (The rule is that, if any selector matches deprecated ciphers, we'll allow the overall expression to match it. This is slightly weird, but keeps "RSA+3DES" working.) Note, in this CL, 3DES is not actually deprecated. This just sets up the machinery and doesn't do anything with it. But the blockers for deprecating that should hopefully be resolved soon. Bug: 599 Change-Id: I7212bdc879b0e49c6742025644f3100026f24228 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59646 Auto-Submit: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]> Commit-Queue: Adam Langley <[email protected]>
rsa_test.cc and a few of the tests in evp_tests.txt still do, but refresh the easy one. I switched it to a P-256 key to keep the input small. Bug: 607 Change-Id: Ie0614280d12012bbd928f095eb4531e4d7550ae1 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59666 Auto-Submit: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]> Commit-Queue: Adam Langley <[email protected]>
This TLS 1.2 algorithm is substantially inferior to AES-GCM and should never be used. It will not be available unless configured by name. However, in can be used to provide backwards-compatibility with devices that cannot be updated if so needed. Change-Id: I1fd78efeb33aceca76ec2e7cb76b70f761ed1af8 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59585 Reviewed-by: David Benjamin <[email protected]> Commit-Queue: Adam Langley <[email protected]> Auto-Submit: Adam Langley <[email protected]>
Envoy needs to have the possible cipher, etc., strings predeclared to reduce synchronization needs in the steady state. It currently does this by (1) iterating over SSL_CTX_get_ciphers at SSL_CTX creation time and (2) hard-coding a lists of known TLS 1.3 ciphers, TLS versions, NamedGroups, etc. (1) would work for some applications, but it breaks any applications that configure ciphers on the SSL on a certificate callback, etc. If the callback configures a cipher that wasn't configured on the SSL_CTX (e.g. if the SSL_CTX were left at defaults), Envoy's logging breaks and we hit an ENVOY_BUG assertion. (2) breaks whenever BoringSSL adds a new feature. In principle, we could update Envoy when updating BoringSSL, but this is an unresasonable development overhead for just one of many BoringSSL consumers to impose. Such costs are particularly high when considering needing to coordinate updates to Envoy and BoringSSL across different repositories. Add APIs to enumerate the possible strings these functions can return. These string lists are a superset of those that any one application may care about (e.g. we may have a deprecated cipher that Envoy no longer needs, or an experimental cipher that's not yet ready for Envoy's stability goals), but this is fine provided this is just used to initialize the table. In particular, they are *not* intended to enumerate supported features. Bump BORINGSSL_API_VERSION to aid in patching these into Envoy. Bug: b:280350955 Change-Id: I4d11db980eebed5620d3657778c09dbec004653c Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59667 Commit-Queue: Adam Langley <[email protected]> Auto-Submit: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
This aligns with google/oss-policies-info#8 and grpc/grpc#32614. VS2019 adds a C11 mode, which is useful for us, because it means stdalign.h works correctly. Also bump the minimum Windows SDK to https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-arriving-in-msvc/. If you have a new MSVC, CMake will enable C11 mode by default. But if C11 mode is enabled but your Windows SDK is too old, things break. After this change, the CI will include some redundant configurations. All the VS2017 configurations will start testing on VS2019, so the VS2019-specific configurations won't do anything. I'll follow this up with a change to bump those to VS2022, where we're currently missing coverage. Update-Note: BoringSSL now requires VS2019 or later and no longer supports VS2017. VS2017 has been past its "mainstream end date" for over a year now, per https://learn.microsoft.com/en-us/lifecycle/products/visual-studio-2017 Change-Id: I3f359e8ea7c9428ddaa9fcc4ffead2ef903398be Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59665 Reviewed-by: Adam Langley <[email protected]> Commit-Queue: David Benjamin <[email protected]>
Bug: 574 Change-Id: Ief8030a4c9257a05defdc2dc02d496c63f06626a Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59545 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
Includes bits by me and Robert Nagy <[email protected]> who has a google CLA. Update-Note: Additionally, BoringSSL now requires macOS 10.12 or later for getentropy support. This is consistent with https://github.com/google/oss-policies-info/blob/main/foundational-cxx-support-matrix.md WANT_LGTM=all Change-Id: I5ab74fa8a6677fac29c316aa29a954df401ba647 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59225 Commit-Queue: Bob Beck <[email protected]> Reviewed-by: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
Change-Id: I1a4914cc4859ed1e0797a046a1abec8921c4a9cf Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59746 Commit-Queue: Bob Beck <[email protected]> Reviewed-by: Bob Beck <[email protected]> Auto-Submit: David Benjamin <[email protected]>
It's been a couple years since we did that. I had to upstream a fix to CreateArgvFromArgs, but that landed quickly, so we're actually carrying no googletest patches with this. Change-Id: Ifaf3476f2079d444512a235756cfe54f492b9c92 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59745 Reviewed-by: Bob Beck <[email protected]> Commit-Queue: Bob Beck <[email protected]>
This causes avcptool to send requests without blocking on responses. See the diff in ACVP.md for details of how to use this feature. Change-Id: I922b3bd2383cb7d22a5d12ead49d2fa733ee1b97 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/55345 Reviewed-by: David Benjamin <[email protected]> Commit-Queue: Adam Langley <[email protected]>
Trying to fix all the places where these formats go quadratic isn't a good use of time. We've already documented that they're not safe for use with untrusted inputs. Even without such DoS issues, they cannot be safely used anyway. (E.g. RUSTSEC-2023-0023.) Just cap the fuzzer input. It'd be nice if we could avoid this more systematically in the function, but they're not structured to make this easy to do, and anyone concerned about DoS in this function has worse problems. Bug: chromium:1444420, oss-fuzz:56048, 611 Change-Id: I53eeb346f59278ec2db3aac4a92573b927ed8003 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59785 Reviewed-by: Bob Beck <[email protected]> Commit-Queue: David Benjamin <[email protected]> Auto-Submit: David Benjamin <[email protected]>
Saves some duplicated logic. Change-Id: I202fa92a88101f9ad735648bc414ab05752641da Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59685 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
This is inspired somewhat from how https://github.com/google/benchmark's threaded benchmark support works. (It seems to spawn a bunch of threads, latch them all together, and then run.) This adds a TimeFunctionParallel which runs multiple copies of the benchmark in parallel, after waiting for all the threads to synchronize. Some functions had to be tweaked so they don't write to a single, shared output buffer. This probably could use some improvement. In playing with it, the numbers are pretty unstable. We also don't currently benchmark anything that captures EVP's internal refcounts. But hopefully it's enough to get a start. I am able to measure impacts from the PRNG locks at least. Bug: 570 Change-Id: I92c29a05ba082fc45701afd6f0effe23f7b148bd Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59845 Reviewed-by: Adam Langley <[email protected]> Commit-Queue: David Benjamin <[email protected]> Auto-Submit: David Benjamin <[email protected]>
Change-Id: Ibcc3faa4c3e03713a0c8f6dc24119e4579a5b4e4 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59025 Commit-Queue: Bob Beck <[email protected]> Reviewed-by: David Benjamin <[email protected]>
RSATest.MissingParameters tests this case a bit more extensively. Change-Id: I61e38bd139c6334ff9d5c9636a159cb20bcd7f7b Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59825 Commit-Queue: Adam Langley <[email protected]> Reviewed-by: Adam Langley <[email protected]> Auto-Submit: David Benjamin <[email protected]>
We currently don't enforce rsa_check_public_key invariants on private key operations, only public key operations and RSA_check_key. This means it was actually possible, in some corner cases, to perform operations with oversized e or n. Fix this. This gets us a bit closer to aligning the mess of RSA invariants. (RSA_check_key still performs some extra checks, but most of those should be redundant with the CRT self-check.) Update-Note: Manually constructed RSA private keys with invalid n or e will now fail private key operations. Such keys would always fail at public key operations (so the signatures would never verify). They also already failed RSA_check_key and parsing. The one incompatibility of note is keys with only n and d, constructed by reaching into the internal RSA struct, no longer work. Instead, use RSA_new_private_key_no_e. Conscrypt used to do this but has since been migrated to the new API. Bug: 316 Change-Id: I062fdad924b8698e257dab9760687e4b381c970d Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59826 Reviewed-by: Adam Langley <[email protected]> Commit-Queue: David Benjamin <[email protected]>
512-bit RSA was factored in 1999, so this limit barely means anything. But establish some limit now to ratchet in what we can. We'll raise this limit as we clear through further rounds of bad keys in tests. As part of this, I've touched up rsa_test.cc a bit. All the functions that made assumptions on key size now use std::vector with RSA_size. kKey1 and kKey2 were also 512- and 400-bit RSA, respectively. In principle, we could keep kKey1 for now, but the next stage will break it anyway. I've replaced them with kFIPSKey (which was "FIPS-compliant" but actually 1024-bit) and kTwoPrime (remnant of multi-prime RSA, 2048-bit). As neither name makes sense, they're just the new kKey1 and kKey2. I've also switched from string literals to arrays, which avoids the pesky trailing NUL. Sadly, it is a bit more verbose. Maybe we should switch to writing something like: const std::vector<uint8_t> kKey1 = MustDecodeHex("abcdef1234..."); Static initializers don't matter in tests, after all. Update-Note: We no longer accept 511-bit RSA and below. If you run into this, update test keys to more modern sizes as we plan to raise the limit beyond 512-bit RSA in the future. 512-bit RSA was factored in 1999, so keys at or near this limit have been obsolete for a very, very long time. Bug: 607 Change-Id: I13c3366d7e5f326710f1d1b298f4150a4e8e4d78 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59827 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
Right now, MSVC has to fallback to refcount_lock.c, which uses a single, global lock for all refcount operations. Instead, use the Interlocked* APIs to implement them. The motivation is two-fold. First, this removes a performance cliff when building for Windows on a non-Clang compiler. (Although I've not been able to measure it in an end-to-end EVP benchmark, only a synthetic refcount-only benchmark.) More importantly, it gets us closer to assuming atomics support on all non-NO_THREADS configurations. (The next CL will clear through that.) That, in turn, will make it easier to add an atomics-like abstractions to some of our hotter synchronization points. (Even in newer glibc, with its better rwlock, read locks fundamentally need to write to memory, so we have some cacheline contention on shared locks.) Annoyingly, the Windows atomic_load replacement is not quite right. I've used a "no-op" InterlockedCompareExchange(p, 0, 0) which, empirically, still results in a write. But a write to the refcount cacheline is surely better than taking a global exclusive lock. See comments in file for details. OpenSSL uses InterlockedOr(p, 0), but that actually results in even worse code. (InterlockedOr needs a retry loop when the underlying cmpxchg fails, whereas InterlockedCompareExchange is a single cmpxchg.) Hopefully, in the future (perhaps when we require VS 2022's successor, based on [1]), this can be removed in favor of C11 atomics everywhere. [1] https://devblogs.microsoft.com/cppblog/c11-atomics-in-visual-studio-2022-version-17-5-preview-2/ Bug: 570 Cq-Include-Trybots: luci.boringssl.try:linux_clang_rel_tsan Change-Id: I125da139e2fd3ae51e54309309fda16ba97ccf20 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59846 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
On Windows, we can rely on Interlocked APIs. On non-Windows builds, we currently require C11 but permit C11 atomics to be missing, via __STDC_NO_ATOMICS__. This CL tightens this so C11 atomics are required on non-MSVC builds. My hope is that, now that we require C11 on non-Windows, this is a fairly safe requirement. We already require pthreads on any platform where this might apply, and it's hard to imagine someone has C11, pthreads, but not C11 atomics. This change means that, in later work, we can refactor the refcount logic to instead be a <stdatomic.h> compatibility layer, and then an atomics-targetting CRYPTO_refcount_t implementation. With a <stdatomic.h> compatibility layer, we can use atomics in more places, notably where our uses of read locks are causing cacheline contention. The platform restriction isn't *strictly* necessary. We could, like with refcounts, emulate <stdatomic.h> with a single, global lock. Indeed any platforms in this situation have already been living with that lock for refcounts without noticing. But then later work to add "atomics" to read locks would regress contention for those platforms. So I'm starting by rejecting this, so if any such platform exists, we can understand their performance needs before doing that. Update-Note: On non-Windows platforms, we now require C11 atomics support. Note we already require C11 itself. If this affects your build, get in touch with BoringSSL maintainers. Bug: 570 Cq-Include-Trybots: luci.boringssl.try:linux_clang_rel_tsan Change-Id: I868fa4ba87ed73dfc9d52e80d46853ef56715a5f Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59847 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
refcount.c is now a single, generic file that calls into C11-atomic-like APIs. Behind the scenes, this selects one of C11 atomics, Windows interlocked APIs, or unsynchronized reads/writes (in the no-threads build). This frees us up to use atomics elsewhere in the library. For now, this only binds sequentially consistent atomics, but we can add other memory orders if needed. In particular, I believe up_ref only needs relaxed atomics. Some of the later change I think only need acquire and release, but I'm not positive. Bug: 570 Cq-Include-Trybots: luci.boringssl.try:linux_clang_rel_tsan Change-Id: Ifcd7357611bb7a8cd14b82c23ad080d1a2df1386 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59848 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
We don't take write locks in the PRNG, steady state, but we do take some read locks: computing fork generation, reading the fork-unsafe buffering flag, and a FIPS-only artifact of some global state clearing mess. That last one is completely useless, but it's a consequence of FIPS's understanding of process exit being comically inconsistent with reality. Taking read locks is, in principle, parallel, but the cacheline write causes some contention, even in newer glibcs with faster read locks. Fix these: - Use atomic reads to check the fork generation. We only need to lock when we observe a fork. - Replace the fork-unsafe buffering flag with an atomic altogether. - Split state_clear_all_lock into a per-rand_thread_state lock. We still need a read lock, but a completely uncontended one until process exit. With many threads, this gives a significant perf boost. x86_64, non-FIPS, Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz, 30 threads: Before: Did 45131875 RNG (16 bytes) operations in 300039649us (150419.7 ops/sec): 2.4 MB/s Did 44089000 RNG (32 bytes) operations in 300053237us (146937.3 ops/sec): 4.7 MB/s Did 43328000 RNG (256 bytes) operations in 300058423us (144398.5 ops/sec): 37.0 MB/s Did 45857000 RNG (1350 bytes) operations in 300095943us (152807.8 ops/sec): 206.3 MB/s Did 43249000 RNG (8192 bytes) operations in 300102698us (144114.0 ops/sec): 1180.6 MB/s After: Did 296204000 RNG (16 bytes) operations in 300009524us (987315.3 ops/sec): 15.8 MB/s Did 311347000 RNG (32 bytes) operations in 300014396us (1037773.5 ops/sec): 33.2 MB/s Did 295104000 RNG (256 bytes) operations in 300012657us (983638.5 ops/sec): 251.8 MB/s Did 255721000 RNG (1350 bytes) operations in 300016481us (852356.5 ops/sec): 1150.7 MB/s Did 103339000 RNG (8192 bytes) operations in 300040059us (344417.3 ops/sec): 2821.5 MB/s (Smaller PRNG draws are more impacted because they spend less time in the DRBG. But they're also more likely because you rarely need to pull 8K of data out at once.) x86_64, FIPS, Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz, 30 threads: Before: Did 29060000 RNG (16 bytes) operations in 300081190us (96840.5 ops/sec): 1.5 MB/s Did 31882000 RNG (32 bytes) operations in 300118031us (106231.5 ops/sec): 3.4 MB/s Did 30925000 RNG (256 bytes) operations in 300113646us (103044.3 ops/sec): 26.4 MB/s Did 31969000 RNG (1350 bytes) operations in 300096688us (106529.0 ops/sec): 143.8 MB/s Did 33434000 RNG (8192 bytes) operations in 300093240us (111412.0 ops/sec): 912.7 MB/s After: Did 299013000 RNG (16 bytes) operations in 300012167us (996669.6 ops/sec): 15.9 MB/s Did 289788000 RNG (32 bytes) operations in 300014611us (965913.0 ops/sec): 30.9 MB/s Did 298699000 RNG (256 bytes) operations in 300013443us (995618.7 ops/sec): 254.9 MB/s Did 247061000 RNG (1350 bytes) operations in 300018215us (823486.7 ops/sec): 1111.7 MB/s Did 100479000 RNG (8192 bytes) operations in 300037708us (334887.9 ops/sec): 2743.4 MB/s On an M1 Pro, it's mostly a wash by default (fewer threads because this chip has fewer cores) aarch64, M1 Pro, 8 threads: Before: Did 23218000 RNG (16 bytes) operations in 80009131us (290191.9 ops/sec): 4.6 MB/s Did 23021000 RNG (256 bytes) operations in 80007544us (287735.4 ops/sec): 73.7 MB/s Did 22853000 RNG (1350 bytes) operations in 80013184us (285615.4 ops/sec): 385.6 MB/s Did 25407000 RNG (8192 bytes) operations in 80008371us (317554.3 ops/sec): 2601.4 MB/s Did 22128000 RNG (16384 bytes) operations in 80013269us (276554.1 ops/sec): 4531.1 MB/s After: Did 23303000 RNG (16 bytes) operations in 80011433us (291245.9 ops/sec): 4.7 MB/s Did 23072000 RNG (256 bytes) operations in 80008755us (288368.4 ops/sec): 73.8 MB/s Did 22807000 RNG (1350 bytes) operations in 80013355us (285039.9 ops/sec): 384.8 MB/s Did 23759000 RNG (8192 bytes) operations in 80010212us (296949.6 ops/sec): 2432.6 MB/s Did 23193000 RNG (16384 bytes) operations in 80011537us (289870.7 ops/sec): 4749.2 MB/s This is likely because, without RDRAND or MADV_WIPEONFORK, we draw from the OS on every call. We're likely bottlenecked by getentropy, whether it's some internal synchronization or syscall overherad. With fork-unsafe buffering enabled, this change shows even more significant wins on the M1 Pro. aarch64, fork-unsafe buffering, M1 Pro, 8 threads: Before: Did 25727000 RNG (16 bytes) operations in 80010579us (321545.0 ops/sec): 5.1 MB/s Did 25776000 RNG (32 bytes) operations in 80008587us (322165.4 ops/sec): 10.3 MB/s Did 25780000 RNG (256 bytes) operations in 80006127us (322225.3 ops/sec): 82.5 MB/s Did 33171250 RNG (1350 bytes) operations in 80002532us (414627.5 ops/sec): 559.7 MB/s Did 54784000 RNG (8192 bytes) operations in 80005706us (684751.2 ops/sec): 5609.5 MB/s After: Did 573826000 RNG (16 bytes) operations in 80000668us (7172765.1 ops/sec): 114.8 MB/s Did 571329000 RNG (32 bytes) operations in 80000423us (7141574.7 ops/sec): 228.5 MB/s Did 435043750 RNG (256 bytes) operations in 80000214us (5438032.3 ops/sec): 1392.1 MB/s Did 229536000 RNG (1350 bytes) operations in 80001888us (2869132.3 ops/sec): 3873.3 MB/s Did 57253000 RNG (8192 bytes) operations in 80004974us (715618.0 ops/sec): 5862.3 MB/s Note that, on hardware with RDRAND, the read lock in rand_fork_unsafe_buffering_enabled() doesn't do much. But without RDRAND, we hit that on every RAND_bytes call. More importantly, the subsequent CL will fix a bug that will require us to hit it more frequently. I've removed the volatile on g_fork_detect_addr because I think we didn't need it and this avoids thinking about the interaction between volatile and atomics. The pointer is passed into madvise, so the compiler knows the pointer escapes. For it to be invalid, the compiler would need to go out of its way to model madvise as not remembering the pointer, which would be incorrect of it for MADV_WIPEONFORK. Bug: 570 Cq-Include-Trybots: luci.boringssl.try:linux_clang_rel_tsan Change-Id: Ie6977acd1b8e7639aaa419cf6f4f5f0645bde9d1 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59849 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
Change-Id: I7458b1d7aa1736d586dc80660d59c07fa2ac1c8a Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59805 Reviewed-by: Bob Beck <[email protected]> Reviewed-by: David Benjamin <[email protected]> Commit-Queue: David Benjamin <[email protected]>
Update-Note: SSL_CIPHER_get_value was our original name for the function. OpenSSL later called it SSL_CIPHER_get_protocol_id. I believe all external callers have since been updated to use the new function. (If I missed a few stragglers, replace with SSL_CIPHER_get_protocol_id to fix.) Change-Id: I956fb49bf2d13a898eed73177493d2c8d50778ad Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60205 Reviewed-by: Bob Beck <[email protected]> Auto-Submit: David Benjamin <[email protected]> Commit-Queue: Bob Beck <[email protected]>
We're this awkward mix of "group" and "curve" right now. On the spec side, this is because they used to be "curves", but then RFC 7919 renamed to "group" in an attempt to generalize FFDH and ECDH. The negotiated FFDH stuff never really went anywhere (the way it used cipher suite values in TLS 1.2 made it unusable), but the name change stuck. In our implementation and API, we originally called it "curve". In preparation for TLS 1.3, we renamed the internals to "group" to match the spec in https://boringssl-review.googlesource.com/c/boringssl/+/7955, but the public API was still "curve". Then we exported a few more things in https://boringssl-review.googlesource.com/c/boringssl/+/8565, but I left it at "curve" to keep the public API self-consistent. Then we added OpenSSL's new "group" APIs in https://boringssl-review.googlesource.com/c/boringssl/+/54306, but didn't go as far to deprecate the old ones yet. Now I'd like to add new APIs to clear up the weird mix of TLS codepoints and NIDs that appear in our APIs. But our naming is a mess, so either choice of "group" or "curve" for the new API looks weird. In hindsight, we probably should have left it at "curve". Both terms are equally useless for the future post-quantum KEMs, but at least "curve" is more unique of a name than "group". But at this point, I think we're too far through the "group" rename to really backtrack: - Chromium says "group" in its internals - QUICHE says "group" in its internals and public API - Our internals say "group" - OpenSSL has switched to "group" and deprecated "curve", so new APIs will be based on "group" So align with all this and say "group". This CL handles set1_curves and set1_curves_list APIs, which already have "group" replacements from OpenSSL. A follow-up CL will handle our APIs. This is a soft deprecation because I don't think updating things is particularly worth the churn, but get the old names out of the way, so new code can have a simpler API to target. Also rewrite the documentation for that section accordingly. I don't think we need to talk about how it's always enabled now. That's a reference to some very, very old OpenSSL behavior where ECDH negotiation needed to be separately enabled. Change-Id: I7a356793d36419fc668364c912ca7b4f5c6c23a2 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60206 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]> Reviewed-by: Bob Beck <[email protected]>
This adds "group" versions of our codepoint-based APIs. This is mostly because it looks weird to switch terminology randomly in the implementation. See I7a356793d36419fc668364c912ca7b4f5c6c23a2 for additional context. I've not bothered renaming the bssl_shim flags. That seems a waste of time. Change-Id: I566ad132f5a33d99efd8cb2bb8b76d9a6038c825 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60207 Reviewed-by: Adam Langley <[email protected]> Commit-Queue: David Benjamin <[email protected]>
Right now we use NIDs to configure the group list, but group IDs (the TLS codepoints) to return the negotiated group. The NIDs come from OpenSSL, while the group ID was original our API. OpenSSL has since added SSL_get_negotiated_group, but we don't implement it. To add Kyber to QUIC, we'll need to add an API for configuring groups to QUICHE. Carrying over our inconsistency into QUICHE's public API would be unfortunate, so let's use this as the time to align things. We could either align with OpenSSL and say NIDs are now the group representation at the public API, or we could add a parallel group ID API. (Or we could make a whole new SSL_NAMED_GROUP object to pattern after SSL_CIPHER, which isn't wrong, but is even more new APIs.) Aligning with OpenSSL would be fewer APIs, but NIDs aren't a great representation. The numbers are ad-hoc and even diverge a bit between OpenSSL and BoringSSL. The TLS codepoints are better to export out to callers. Also QUICHE has exported the negotiated group using the codepoints, so the natural solution would be to use codepoints on input too. Thus, this CL adds SSL_CTX_set1_group_ids and SSL_set1_group_ids. It also rearranges the API docs slightly to put the group ID ones first, and leaves a little note about the NID representation before introducing those. While I'm here, I've added SSL_get_negotiated_group. NGINX seems to use it when available, so we may as well fill in that unnecessary compatibility hole. Bug: chromium:1442377 Change-Id: I47ca8ae52c274133f28da9893aed7fc70f942bf8 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60208 Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]>
Many of our point addition functions internally check for the doubling case and branch because the addition formulas are incomplete. This branch is fine because the multiplication formulas are arranged to not hit this case. However, we don't want to leak the couple of intermedate values that determine whether to branch. Previously, we ran into this with https://boringssl-review.googlesource.com/c/boringssl/+/36465. This wasn't sufficient. The compiler understands if (a & b) enough to compile into two branches. Thanks to Moritz Schneider, Nicolas Dutly, Daniele Lain, Ivan Puddu, and Srdjan Capkun for reporting this! Fix the leak by adding a value barrier on the final value. As we're also intentionally leaking the result of otherwise secret data flow, I've used the constant_time_declassify functions, which feed into our valgrind-based constant-time validation and double as barriers. Accordingly, I've also added some CONSTTIME_SECRET markers around the ECDSA nonce value, so we can check with valgrind the fix worked. The marker really should be at a lower level, at ec_random_nonzero_scalar or further (maybe RAND_bytes?), but for now I've just marked the nonce. To then clear valgrind, add constant_time_declassify in a few other places, like trying to convert infinity to affine coordinates. (ECDH deals with secret points, but it is public that the point isn't infinity.) Valgrind now says this code is constant-time, at least up to compilation differences introduced by the annotations. I've also inspected the compiler output. This seems to be fine, though neither test is quite satisfying. Ideally we could add annotations in ways that don't influence compiler output. Change-Id: Idfc413a75d92514717520404a0f5424903cb4453 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60225 Reviewed-by: Adam Langley <[email protected]> Commit-Queue: Adam Langley <[email protected]> Auto-Submit: David Benjamin <[email protected]>
All inputs are marked as secret. This is not to support a use case for calling X25519 with a secret *point* as the input, but rather to ensure that the choice of the point cannot influence whether the scalar is leaked or not. Same for the initial contents of the output buffer. This is a conservative choice and may be revised in the future. Change-Id: I595d454a8e1fdc409912aee751bb0b3cf46f5430 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60186 Reviewed-by: David Benjamin <[email protected]> Commit-Queue: David Benjamin <[email protected]>
NGINX requires this constant if SSL_get_negotiated_group is defined. OpenSSL returns this to indicate a named group constant it doesn't understand, which doesn't make sense out of SSL_get_negotiated_group because the library wouldn't negotiate a group it doesn't know about. Nonetheless, define it for compatibility. Fixed: 615 Change-Id: I05a6d607912bb8507be9ac6ff5fe1699b4715f6b Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60326 Commit-Queue: Adam Langley <[email protected]> Commit-Queue: David Benjamin <[email protected]> Reviewed-by: Adam Langley <[email protected]> Auto-Submit: David Benjamin <[email protected]>
Andres Erbsen noticed we didn't actually have tests to catch when the format macros were wrong. In doing so, remove BN_DEC_FMT2. It was unused and only makes sense in the context of the bignum-to-decimal conversion, where we no longer use it anyway. None of these macros are exported in OpenSSL at all, so it should be safe to remove it. (We possibly can remove the others too. I see one use outside the library, and that use would probably be better written differently anyway.) Change-Id: I4ffc7f9f7dfb399ac060af3caff0778000010303 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60325 Reviewed-by: Bob Beck <[email protected]> Commit-Queue: Bob Beck <[email protected]> Auto-Submit: David Benjamin <[email protected]>
BN_nnmod() can deal with the situation that the first and the second arguments are the same, but it cannot deal with the first and the second argument being equal. In that situation, if BN_mod(x, y, x, ctx) results in a negative x, then the result of BN_nnmod() is zero. This breaks the strange BN_mod_inverse(m, a, m, ctx). Reported by Guido Vranken in openssl/openssl#21110 Change-Id: I8584720660f214f172b3b33716a5e3b29e8f2fd8 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60365 Reviewed-by: Bob Beck <[email protected]> Reviewed-by: David Benjamin <[email protected]> Commit-Queue: David Benjamin <[email protected]>
Bug: 285222360 Change-Id: I7f35bcc734dd853e99cb691bdc681f75c9f137e4 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60265 Reviewed-by: Bob Beck <[email protected]> Commit-Queue: Bob Beck <[email protected]>
Did 59000 Ed25519 key generation operations in 1004188us (58753.9 ops/sec) [+8.3%] Did 57000 Ed25519 signing operations in 1005649us (56679.8 ops/sec) [+7.9%] Did 19000 Ed25519 verify operations in 1054380us (18020.1 ops/sec) [-2.0%] Did 61000 Curve25519 base-point multiplication operations in 1007401us (60551.9 ops/sec) [+8.3%] Did 22000 Curve25519 arbitrary point multiplication operations in 1022882us (21507.9 ops/sec) [+0.5%] Change-Id: I14668f658b1ae99850cb0f8938f90f988d0edd0b Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60107 Reviewed-by: David Benjamin <[email protected]> Commit-Queue: David Benjamin <[email protected]>
…M128_CONTEXT layout in aesv8-gcm-armv8.pl. This is modifying not-yet-used code.
…T layout in aesni-gcm-x86_64.pl.
Commit 784fa29 should have deleted this comment. The analogous comment was deleted in BoringSSL in that merged commit.
…r curve25519. Don't add the constant-time validation tests since we need to develop the framework for it first. Do add the public-from-private test.
Originally I was trying to be pedantic and avoid any use of `_t`- suffixed names. However, this hasn't really accomplished anything except annoying me, so just do what BoringSSL does.
Bring these in as they were in 4a0393f. The next merge will modify these.
*ring* doesn't use the BoringSSL code that uses these constant-time utilities.
Codecov Report
@@ Coverage Diff @@
## main #1659 +/- ##
==========================================
- Coverage 92.09% 92.07% -0.03%
==========================================
Files 132 132
Lines 18869 18921 +52
Branches 196 199 +3
==========================================
+ Hits 17378 17421 +43
- Misses 1455 1463 +8
- Partials 36 37 +1
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.