-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Large percentage of cpu time in memset when reading with a larger buffer #55
Comments
I believe this is because the https://github.com/rustls/tokio-rustls/blob/main/src/common/mod.rs#L218 We can do some optimizations on |
This problem can also be fixed if you change |
We could limit zero-initialization to the maximum TLS record size. On modern CPUs, a 16KB memset should not have a noticeable impact. |
A possible solution could be the following. I think this is sound, since I'm profiling this on an aws m6in.8xlarge instance, downloading from s3. In the flamegraph the The call stack is coming from hyper h1 diff --git a/src/common/mod.rs b/src/common/mod.rs
index fde34c0..a9e3115 100644
--- a/src/common/mod.rs
+++ b/src/common/mod.rs
@@ -248,6 +248,10 @@ where
}
}
+ // Safety: We trust `read` to only write initialized bytes to the slice and never read from it.
+ unsafe {
+ buf.assume_init(buf.remaining());
+ }
match self.session.reader().read(buf.initialize_unfilled()) {
// If Rustls returns `Ok(0)` (while `buf` is non-empty), the peer closed the
// connection with a `CloseNotify` message and no more data will be forthcoming.
|
@seanmonstar do you think it's feasible to avoid recreating the |
I suppose theoretically, but realistically at the moment, that |
I think best way for now, apart from stabilize like pub trait ReadBuf {
fn append(&mut self, buf: &[u8]);
}
pub fn read_buf(&mut self, buf: &mut dyn ReadBuf) {
//
} |
rustls can be asked how much data it would write into an infinite size buffer provided to Footnotes
|
I am considering using unbuffered api refactor, which would also solve the problem. |
Reproduced using a slightly modified version of
examples/client.rs
to allow specifying a path, and reading into a user-provided buffer. The problem was originally noticed viarusoto
s3 client, which has a configuration for a read buffer size, which gets mapped tohyper
http1_read_buf_exact_size
. Since aws s3 recommends fetching large ranges, using a larger buffer seemed like a good idea and was not expected to cause any cpu overhead.The text was updated successfully, but these errors were encountered: