Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: Store packed block in signed_block #1148

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

heifner
Copy link
Member

@heifner heifner commented Feb 6, 2025

Cache the packed block data in signed_block so that it can be retrieved without having to pack it at time of serialization.

Resolves #1062

…ock was the correct block id. Not used, so just removed it.
…block so it does not need to be re-packed when needed for the block log or for P2P. Refactor signed_block construction so that it is harder to use incorrectly. It is necessary to always store the packed block in case it is needed. Specialize signed_block unpack so that it always fills in the packed_block of signed_block.
@heifner heifner added the OCI Work exclusive to OCI team label Feb 6, 2025
@heifner heifner requested review from greg7mdp and linh2931 February 6, 2025 13:00
@@ -85,21 +85,26 @@ namespace eosio { namespace chain {

using block_extension = block_extension_types::block_extension_t;

using signed_block_ptr = std::shared_ptr<const signed_block>;
using mutable_block_ptr = std::unique_ptr<signed_block>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not keep the original name mutable_signed_block_ptr? That's more clearer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in most cases it is not "signed" yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment? We have other cases where mutable and non-mutable use the same root like mutale_db() and db().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

template <typename Stream>
void unpack(Stream& s, eosio::chain::signed_block& v) {
try {
if constexpr (requires { s.extract_mirror(); }) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't quite get this code. Why the if constexpr check? Why the 4096 below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because if signed_block is being unpacked with a datastream_mirror then we can use the extract_mirror() to populate signed_block::packed_block. If a different Stream type is being used, then wrap it in a datastream_mirror to capture the packed block data. The 4096 is used to reserve a size in the vector; open to removing the + 4096 if we think it might be wasteful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I got confused by the method name is unpack but then it sets packed_block. Add a comment about 4096.

@@ -1590,7 +1578,7 @@ struct controller_impl {

// blog.append could fail due to failures like running out of space.
// Do it before commit so that in case it throws, DB can be rolled back.
blog.append( (*bitr)->block, (*bitr)->id(), irreversible_mode() ? f.get() : it++->get() );
blog.append( (*bitr)->block, (*bitr)->id(), (*bitr)->block->packed_signed_block() );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it is cleaner.

@@ -129,3 +142,20 @@ FC_REFLECT_DERIVED(eosio::chain::transaction_receipt, (eosio::chain::transaction
FC_REFLECT(eosio::chain::additional_block_signatures_extension, (signatures));
FC_REFLECT(eosio::chain::quorum_certificate_extension, (qc));
FC_REFLECT_DERIVED(eosio::chain::signed_block, (eosio::chain::signed_block_header), (transactions)(block_extensions) )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not put this code some where in fc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow. signed_block is defined in libraries/chain which depends on libraries/libfc.
I did try to move the forward declare out of fc, but couldn't get it to work without the fc forward declare.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant unpack(Stream& s, eosio::chain::signed_block& v) is defined in raw_fwd.hpp, why not put the implementation there. I think about it more. It is also OK to remain here.

@@ -36,7 +36,7 @@ struct block_log_fixture {
}
else {
eosio::chain::genesis_state gs;
log->reset(gs, std::make_shared<eosio::chain::signed_block>());
log->reset(gs, eosio::chain::signed_block::create_signed_block(eosio::chain::signed_block::create_mutable_block({})));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this complicated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to make it difficult to misuse signed_block. We want packed_block data of signed_block to always be populated. Could create a method of signed_block that just does this, but since that would only be useful in tests I didn't.

return read_block(*ds, block_num);
uint64_t block_size = 0;

auto ds = catalog.ro_stream_and_size_for_block(block_num, block_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not from your changes. I wish ro_stream_and_size_for_block returned ds and size explicitly.

@ericpassmore
Copy link
Contributor

Note:start
category: Other
component: Internal
summary: Cache the packed block data in signed_block so that it can be retrieved without having to pack it at time of serialization.
Note:end

Copy link
Contributor

@greg7mdp greg7mdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I know why adding the:
namespace eosio::chain { struct signed_block; }

at the top of block.hpp didn't work. It is because some files like block_state_legacy.cpp include block_state_legacy.hpp first, and then block_state_legacy.hpp includes block_header_state_legacy.hpp before block.hpp.

So for files like this the raw_fwd.hpp is included before the forward delaration of signed_block is seen.

Comment on lines +105 to +106
static mutable_block_ptr create_mutable_block(const signed_block_header& h) { return std::unique_ptr<signed_block>(new signed_block(h)); }
static signed_block_ptr create_signed_block(mutable_block_ptr&& b) { b->pack(); return signed_block_ptr{std::move(b)}; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why these should be inside struct signed_block. even if pack() has tobe made public.

It is also a little strange to me that bytes packed_block; is inside signed_block? Why not have a separate type packed_signed_block? Maybe it would require more changes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are constructor like methods. Seems like they belong in signed_block, also allows pack() to be private without adding friends of the external functions.

We emit/signal signed_block out to plugins. signed_block is a well known and used type. I see this as an internal cached optimization to make packed_signed_block() efficient. Creating a different type means that we have to pass around a packed_signed_block everywhere we use signed_block today.

bool read( char* d, size_t s ) {
if (ds.read(d, s)) {
auto size = mirror.size();
mirror.resize(size + s);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is potentially very inefficient. I suggest mirror.resize(std::bit_ceil(size + s));

template<typename Stream> friend void fc::raw::unpack(Stream& s, eosio::chain::signed_block& v);
void pack() { packed_block = fc::raw::pack( *this ); }

bytes packed_block; // packed this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be a std::future<bytes>, so we can post the packed_block = fc::raw::pack( *this ); to a thread pool?

fc::reflector<eosio::chain::signed_block>::visit( fc::raw::detail::unpack_object_visitor<Stream, eosio::chain::signed_block>( v, s ) );
v.packed_block = s.extract_mirror();
} else {
fc::datastream_mirror<Stream> ds(s, sizeof(eosio::chain::signed_block) + 4096);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the sizeof(eosio::chain::signed_block) is very significant since most of the memory used is likely in the deque<transaction_receipt> transactions; whose sizeof() is probably around 16. I think I would just simplify to just 4096, but it probably doesn't matter much and is fine as is.

if (pos != block_log::npos) {
block_file.seek(pos);
return read_block(block_file, block_num);
fc::datastream_mirror ds(block_file, size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, here we dimension the mirror vector to exactly the right size.

auto mb_ds = pending_message_buffer.create_datastream();
fc::raw::unpack( mb_ds, which );

fc::datastream_mirror ds(mb_ds, message_length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, good dimensioning of the vector here as well. I see that my suggestion on the resize() was not as critical as I originally thought.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCI Work exclusive to OCI team performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance: Store packed block in block_state
4 participants