Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of decoding #62

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Conversation

taiki-e
Copy link

@taiki-e taiki-e commented Jul 18, 2021

This change makes decoding about 10x faster than the current implementation, on my machine.

Before:

hex_decode              time:   [128.28 us 129.41 us 130.70 us]

After:

hex_decode              time:   [12.206 us 12.242 us 12.284 us]                        
Benchmarks per commit

Current main branch (aa8f300)

hex_decode              time:   [128.28 us 129.41 us 130.70 us]

First commit of this PR: Use lookup table for decoding (870590e)

hex_decode              time:   [63.538 us 63.936 us 64.419 us]
                        change: [-50.833% -50.462% -50.088%] (p = 0.00 < 0.05)
                        Performance has improved.

Second commit of this PR: Inline val function (3b75b32)

hex_decode              time:   [32.263 us 32.382 us 32.521 us]
                        change: [-49.441% -49.098% -48.682%] (p = 0.00 < 0.05)
                        Performance has improved.

Third commit of this PR: Use chunks_exact instead of chunks (ff1d115)

hex_decode              time:   [24.701 us 24.814 us 24.936 us]
                        change: [-24.459% -23.911% -23.395%] (p = 0.00 < 0.05)
                        Performance has improved.

Fourth commit of this PR: Use decode_to_slice in Vec::from_hex (4fd0ed8)

hex_decode              time:   [16.060 us 16.297 us 16.608 us]
                        change: [-35.137% -34.468% -33.556%] (p = 0.00 < 0.05)
                        Performance has improved.

Fifth commit of this PR: Inline decode_to_slice function (3e2aa25)

hex_decode              time:   [12.206 us 12.242 us 12.284 us]                        
                        change: [-21.954% -20.254% -18.150%] (p = 0.00 < 0.05)
                        Performance has improved.

@codecov-commenter
Copy link

codecov-commenter commented Jul 18, 2021

Codecov Report

Merging #62 (f3251bd) into main (aa8f300) will decrease coverage by 0.44%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #62      +/-   ##
==========================================
- Coverage   94.55%   94.11%   -0.45%     
==========================================
  Files           4        4              
  Lines         147      153       +6     
==========================================
+ Hits          139      144       +5     
- Misses          8        9       +1     
Impacted Files Coverage Δ
tests/version-number.rs 100.00% <ø> (ø)
src/lib.rs 96.99% <100.00%> (-0.65%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aa8f300...f3251bd. Read the comment docs.

Copy link
Contributor

@Luro02 Luro02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good :)


// Lookup table for ascii to hex decoding.
#[rustfmt::skip]
static DECODE_TABLE: [u8; 256] = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this static and not const?

Suggested change
static DECODE_TABLE: [u8; 256] = [
const DECODE_TABLE: [u8; 256] = [

If this table were const, one could make the val function const: playground

Copy link
Author

@taiki-e taiki-e Aug 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lookup table is relatively large, so using const may bloat the binary size by inlining.

IIRC, the standard library uses static for that use-case. For example: https://github.com/rust-lang/rust/blob/1.54.0/library/core/src/unicode/unicode_data.rs

See also https://stackoverflow.com/questions/52751597/what-is-the-difference-between-a-constant-and-a-static-variable-and-which-should

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, benchmarking (on my machine) shows that changing this to const does not change the performance.

index: idx + 1,
});
}
Ok((upper << 4) | lower)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this bitshift + or do?

Suggested change
Ok((upper << 4) | lower)
// upper and lower are only 4 bits large, so of the 8 bits only the first 4 are used.
// this merges the two 4 bit numbers into one 8 bit number:
//
// upper: 0 0 0 0 U U U U
// lower: 0 0 0 0 L L L L
// result: U U U U L L L L
Ok((upper << 4) | lower)

(does not hurt to explain?)

src/lib.rs Show resolved Hide resolved
];

#[inline]
fn val(bytes: &[u8], idx: usize) -> Result<u8, FromHexError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn val(bytes: &[u8], idx: usize) -> Result<u8, FromHexError> {
const fn val(bytes: &[u8], idx: usize) -> Result<u8, FromHexError> {

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new implementation of val function cannot be const

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the table were const and not static, this function could be constant: playground


use core::iter;
use core::{iter, u8};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm I do not think that you should import the u8 module, where u8::MAX is deprecated.

The u8 type itself has an associated constant u8::MAX.

Suggested change
use core::{iter, u8};
use core::iter;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used u8 module because the u8::MAX associated constant requires a relatively new compiler.

See also: #55

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a point. Personally, I do not really care about MSRV (because I want to use the new features and updating the compiler is just a simple rustup update) so I did not think much about it.

#58 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants