Improve performance of decoding #62
Codecov Report
@@ Coverage Diff @@
## main #62 +/- ##
==========================================
- Coverage 94.55% 94.11% -0.45%
==========================================
Files 4 4
Lines 147 153 +6
==========================================
+ Hits 139 144 +5
- Misses 8 9 +1
Looks good :)
// Lookup table for ascii to hex decoding.
#[rustfmt::skip]
static DECODE_TABLE: [u8; 256] = [
Why is this static and not const?
- static DECODE_TABLE: [u8; 256] = [
+ const DECODE_TABLE: [u8; 256] = [
If this table were const, one could make the val function const (playground).
The lookup table is relatively large, and a const is inlined at every place it is used, so making it const may bloat the binary size.
IIRC, the standard library uses static for that use-case. For example: https://github.com/rust-lang/rust/blob/1.54.0/library/core/src/unicode/unicode_data.rs
By the way, benchmarking (on my machine) shows that changing this to const does not change the performance.
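To make the trade-off in this thread concrete, here is a minimal sketch of the const variant. It is not the code from this PR: the `INVALID` sentinel, the `build_table` helper, and the simplified single-byte `val` signature are assumptions made for illustration. With the compilers available at the time of this discussion, a const fn could not read from a static, which is why the table has to be const for `val` to be const.

```rust
// Hypothetical sentinel for "not a hex digit"; the PR's table may use another value.
const INVALID: u8 = 0xFF;

// Hypothetical helper that fills the table at compile time.
const fn build_table() -> [u8; 256] {
    let mut table = [INVALID; 256];
    let mut i = 0;
    while i < 256 {
        let b = i as u8;
        table[i] = match b {
            b'0'..=b'9' => b - b'0',
            b'a'..=b'f' => b - b'a' + 10,
            b'A'..=b'F' => b - b'A' + 10,
            _ => INVALID,
        };
        i += 1;
    }
    table
}

// const instead of static: the value may be inlined at every use site.
const DECODE_TABLE: [u8; 256] = build_table();

// Simplified stand-in for the PR's val function (one byte in, one nibble out).
const fn val(byte: u8) -> Option<u8> {
    let v = DECODE_TABLE[byte as usize];
    if v == INVALID {
        None
    } else {
        Some(v)
    }
}

// Because val is const, this lookup is evaluated at compile time.
const TEN: Option<u8> = val(b'a');

fn main() {
    assert!(matches!(TEN, Some(10)));
    assert!(val(b'g').is_none());
}
```

Whether the inlining actually increases binary size depends on how often the table is referenced, so measuring both variants, as done above for run time, is the safer way to decide.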
index: idx + 1,
});
}
Ok((upper << 4) | lower)
What does this bitshift + or do?
- Ok((upper << 4) | lower)
+ // upper and lower each hold a 4-bit value, so only the low 4 bits of each byte are used.
+ // This merges the two 4-bit numbers into one 8-bit number:
+ //
+ // upper:  0 0 0 0 U U U U
+ // lower:  0 0 0 0 L L L L
+ // result: U U U U L L L L
+ Ok((upper << 4) | lower)
(does not hurt to explain?)
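For anyone reading along, the shift-and-or can be tried in isolation. This is only a small sketch; `merge_nibbles` is a hypothetical name and does not exist in the PR.

```rust
/// Combine two 4-bit values (nibbles) into one byte.
/// `merge_nibbles` is a hypothetical helper used only for this illustration.
fn merge_nibbles(upper: u8, lower: u8) -> u8 {
    // Both inputs are expected to fit in 4 bits.
    debug_assert!(upper < 16 && lower < 16);
    // Shift the upper nibble into the high 4 bits, then fill the low 4 bits.
    (upper << 4) | lower
}

fn main() {
    // The hex digits 'a' and '7' decode to 0xA and 0x7, so "a7" is the byte 0xA7.
    let byte = merge_nibbles(0xA, 0x7);
    assert_eq!(byte, 0xA7);
    println!("{:#04x} = {:#010b}", byte, byte);
}
```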
];

#[inline]
fn val(bytes: &[u8], idx: usize) -> Result<u8, FromHexError> {
- fn val(bytes: &[u8], idx: usize) -> Result<u8, FromHexError> {
+ const fn val(bytes: &[u8], idx: usize) -> Result<u8, FromHexError> {
The new implementation of the val function cannot be const.
If the table were const and not static, this function could be const (playground).
- use core::iter;
+ use core::{iter, u8};
I used the u8 module because the u8::MAX associated constant requires a relatively new compiler.
See also: #55
You have a point. Personally, I do not really care about MSRV (I want to use the new features, and updating the compiler is just a simple rustup update), so I did not think much about it.
This change makes decoding about 10x faster than the current implementation, on my machine.
Before:
After:
Benchmarks per commit
Current main branch (aa8f300)
First commit of this PR: Use lookup table for decoding (870590e)
Second commit of this PR: Inline val function (3b75b32)
Third commit of this PR: Use chunks_exact instead of chunks (ff1d115)
Fourth commit of this PR: Use decode_to_slice in Vec::from_hex (4fd0ed8)
Fifth commit of this PR: Inline decode_to_slice function (3e2aa25)
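As a rough picture of how the steps named in these commits fit together, here is a sketch of table-driven decoding with chunks_exact and a decode_to_slice that a Vec-returning function delegates to. It is an illustration under assumptions, not the PR's code: the `DecodeError` type, the `INVALID` sentinel, the `build_table` helper, and the `decode` wrapper are hypothetical names.

```rust
// Hypothetical error type standing in for the crate's FromHexError.
#[derive(Debug, PartialEq)]
enum DecodeError {
    OddLength,
    InvalidLength,
    InvalidChar { c: char, index: usize },
}

// Hypothetical sentinel for bytes that are not hex digits.
const INVALID: u8 = 0xFF;

// Hypothetical helper that builds the 256-entry lookup table at compile time.
const fn build_table() -> [u8; 256] {
    let mut table = [INVALID; 256];
    let mut i = 0;
    while i < 256 {
        let b = i as u8;
        table[i] = match b {
            b'0'..=b'9' => b - b'0',
            b'a'..=b'f' => b - b'a' + 10,
            b'A'..=b'F' => b - b'A' + 10,
            _ => INVALID,
        };
        i += 1;
    }
    table
}

static DECODE_TABLE: [u8; 256] = build_table();

// Decode one pair of hex digits; `idx` is the position of the pair in the input.
#[inline]
fn val(pair: &[u8], idx: usize) -> Result<u8, DecodeError> {
    let upper = DECODE_TABLE[pair[0] as usize];
    if upper == INVALID {
        return Err(DecodeError::InvalidChar { c: pair[0] as char, index: idx });
    }
    let lower = DECODE_TABLE[pair[1] as usize];
    if lower == INVALID {
        return Err(DecodeError::InvalidChar { c: pair[1] as char, index: idx + 1 });
    }
    Ok((upper << 4) | lower)
}

// Decode into a caller-provided buffer without allocating.
#[inline]
fn decode_to_slice(input: &[u8], out: &mut [u8]) -> Result<(), DecodeError> {
    if input.len() % 2 != 0 {
        return Err(DecodeError::OddLength);
    }
    if input.len() / 2 != out.len() {
        return Err(DecodeError::InvalidLength);
    }
    // chunks_exact(2) always yields slices of length 2, so indexing
    // pair[0] and pair[1] in val cannot go out of bounds.
    for (i, (pair, byte)) in input.chunks_exact(2).zip(out.iter_mut()).enumerate() {
        *byte = val(pair, 2 * i)?;
    }
    Ok(())
}

// Vec-returning wrapper that reuses decode_to_slice.
fn decode(input: &[u8]) -> Result<Vec<u8>, DecodeError> {
    let mut out = vec![0u8; input.len() / 2];
    decode_to_slice(input, &mut out)?;
    Ok(out)
}

fn main() {
    assert_eq!(decode(b"48656c6c6f"), Ok(b"Hello".to_vec()));
    assert_eq!(decode(b"abc"), Err(DecodeError::OddLength));
}
```

The sketch makes no claim about speed; the per-commit benchmarks listed above are the place to look for that.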