-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: improve ManagedAuthenticatedEncryptor Decrypt() and Encrypt() flow #59424
base: main
Are you sure you want to change the base?
perf: improve ManagedAuthenticatedEncryptor Decrypt() and Encrypt() flow #59424
Conversation
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/KeyManagement/KeyManagementOptions.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/SP800_108/ManagedSP800_108_CTR_HMACSHA512.cs
Outdated
Show resolved
Hide resolved
Code review notes
Scenario notesIs the goal to improve the performance of the [real-world?] AntiForgery benchmark or to improve the performance of DataProtection in a standalone benchmark? The PR description (and attached graph) make it sound like improving the performance of the crank-based benchmark is the goal, but no throughput measurement is provided for the changes in this PR. Please provide that graph. It would supply evidence that these changes have real-world impact and aren't just microbenchmark improvements. |
Thanks for detailed answer @GrabYourPitchforks! Firstly, I ran the Antiforgery benchmark multiple times, and I provided the results in the PR description. Re №3: I ran a BenchmarkDotNet for stackalloc with dynamic \ constant length of stackalloc (also with or without Re №2: Thanks for clarifying it, I will create issues on the dotnet/runtime explaining what API I would like to have to make DataProtection's flow dont use Re №1: Could you please describe how attack surface of the application is increased if pool buffers are used? Does that mean that pooling is easier to inject into via reflection for example? Actually, even if we will not be using pooling byte arrays, if I work with corelib to introduce APIs supporting
|
Looks like this PR hasn't been active for some time and the codebase could have been changed in the meantime. |
The goal of PR is to bring linux performance closer to windows performance for
DataProtection
scenario. Below is the picture of Antiforgery benchmarks on win vs lin machines.Results: DataProtection Benchmark
The benchmark I am relying on to show the result numbers is here, which is basically building default
ServiceProvider
, adding DataProtection via.AddDataProtection()
and callingIDataProtector.Protect()
orIDataProtector.Unprotect()
.Results: Antiforgery Benchmark
However, since we are originally looking at improving the Antiforgery performance on linux, I ran the Antiforgery benchmark including locally built dll's from this PR.
current aspnet core gives these stats on the benchmark:
RPS of run with changed dll's varies from run to run, therefore I ran it 10 times
But memory usage is stable with such values:
Which provide an evidence of ~10% of max app allocation size and ~5% RPS improvement.
Optimization details
I looked into
Unprotect
method forManagedAuthenticatedEncryptor
and spottedMemoryStream
usage and multipleBuffer.BlockCopy
usages. Also I saw that there is some shuffling ofbyte[]
data, which I think can be skipped and performed in such a way, that some allocations are skipped.In order to be as safe as possible, I created a separate
DataProtectionPool
which provides API to rent and return byte arrays. It is not intersecting withArrayPool<byte>.Shared
.ManagedSP800_108_CTR_HMACSHA512.DeriveKeys
is changed to explicit usageManagedSP800_108_CTR_HMACSHA512.DeriveKeysHMACSHA512
, because _kdkPrfFactory is anyway hardcoded to useHMACSHA512
. There is a static API allowing to hash without allocating kdkbyte[]
which is rented from the buffer:HMACSHA512.TryHashData(kdk, prfInput, prfOutput, out _);
Avoided usage of
DeriveKeysWithContextHeader
which allocates a separate intermediate array forcontextHeader
andcontext
. Instead passing the spansoperationSubkey
andvalidationSubkey
directly intoManagedSP800_108_CTR_HMACSHA512.DeriveKeys
ManagedSP800_108_CTR_HMACSHA512.DeriveKeysHMACSHA512
had 2 more arrays (prfInput
andprfOutput
), which now I am renting (viaDataProtectionPool
) or evenstackalloc
'ing. They are returned to the pool withclearArray: true
flag to make sure key material is removed from the memory after usage.In
Decrypt()
flow I am again using HashAlgorithm.TryComputeHash overload, which works based on theSpan<byte>
types, compared to previously used HashAlgorithm.ComputeHashIn
Decrypt()
flow changed usage to SymmetricAlgorithm.DecryptCbc() instead of CryptoTransform.TransformBlock() with same idea to useSpan<byte>
API instead of anotherbyte[]
allocation.Encrypt()
flow is reusing №1, №2 and №3 optimizations as wellEncrypt()
before was relying on theMemoryStream
andCryptoStream
to write data in the result buffer, but I am pre-calculating the length, and then doing a single allocation of result array:var outputArray = new byte[keyModifierLength + ivLength + cipherTextLength + macLength];
All required data is copied into the outputArray via APIs supportingSpan<byte>
.All listed optimizations are included in the
net10
TFM, but only some (№ 2, №3 and №6) are used innetstandard2.0
andnetFx
TFMs which DataProtection also targets.Related to #59287