Decode Chunked Transfer encoded payload prior to pushing to IPFS instead of decoding at replay #126

machawk1 · 2017-03-02T17:36:49Z

In #125 @ibnesayeed mentioned that a page is replayed more often than it is archived. The same content at different URI-Rs should yield the same IPFS hash when pushed. If chunking differs between servers, this will not be the case when the chunk lengths are treated as part of the payload and used as part of the basis for the hash.
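To illustrate the concern, here is a minimal sketch that chunk-encodes one body with two different chunk sizes and hashes the raw wire bytes. `chunk_encode` is a hypothetical helper written for this example (it is not part of ipwb), and SHA-256 is only a stand-in for the content hash IPFS would compute.

```python
import hashlib


def chunk_encode(payload: bytes, chunk_size: int) -> bytes:
    """Encode payload with HTTP/1.1 chunked transfer coding (no extensions, no trailers)."""
    out = bytearray()
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        # Each chunk is: hex length, CRLF, chunk data, CRLF.
        out += b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # terminating zero-length chunk
    return bytes(out)


body = b"<html><body>Same content, different servers</body></html>"

# Two hypothetical servers sending the same body with different chunk sizes.
server_a = chunk_encode(body, 8)
server_b = chunk_encode(body, 20)

# SHA-256 stands in for the IPFS content hash (an assumption for illustration).
digest_a = hashlib.sha256(server_a).hexdigest()
digest_b = hashlib.sha256(server_b).hexdigest()

# The wire bytes differ, so the digests differ even though the body is identical.
print(digest_a == digest_b)
```

Because the chunk-size lines become part of the hashed bytes, identical content ends up under two different hashes unless the payload is dechunked first.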

The logic to decode chunked responses has been implemented in replay.py. Move and adapt this implementation for chunked payloads prior to pushing to IPFS. Use the dechunked payload as the basis for the IPFS hash when writing the CDXJ.
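A rough sketch of what dechunking before hashing could look like follows. It is not the code in replay.py: `dechunk` and `payload_digest` are hypothetical names, trailers are ignored, chunk extensions (see #125) are simply discarded, and SHA-256 again stands in for the hash IPFS would return on push.

```python
import hashlib


def dechunk(data: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body, discarding chunk extensions and trailers."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Chunk-size line: hex length, optionally followed by ";extension".
        size_field = data[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk reached; any trailers are ignored
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


def payload_digest(chunked: bytes) -> str:
    """Hash the dechunked payload (SHA-256 as a stand-in for the IPFS hash)."""
    return hashlib.sha256(dechunk(chunked)).hexdigest()


# The same body, chunked differently (the first uses a chunk extension).
chunked_a = b"5\r\nHello\r\n7;ext=1\r\n, world\r\n0\r\n\r\n"
chunked_b = b"c\r\nHello, world\r\n0\r\n\r\n"

print(payload_digest(chunked_a) == payload_digest(chunked_b))
```

With the payload dechunked at indexing time, the value recorded in the CDXJ depends only on the content, and replay no longer needs to decode chunks on every request.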

ibnesayeed · 2017-03-02T17:40:55Z

@machawk1 you misinterpreted some of the arguments I made. Even so, I think dechunking at storage time would be more beneficial.

machawk1 · 2017-03-02T18:05:59Z

@ibnesayeed I referred to #125 so your rationale will persist. Part of the first paragraph was what I considered when we first spoke of implementing it in replay.

machawk1 mentioned this issue Mar 2, 2017

Account for chunk-extension values when parsing chunked responses #125

Closed

machawk1 changed the title ~~Decode Chunked Transfer encoded payload prior to pushing to IPFS instead of decoding at recplay~~ Decode Chunked Transfer encoded payload prior to pushing to IPFS instead of decoding at replay Mar 2, 2017

machawk1 added enhancement ipwb indexer ipwb replay labels Mar 2, 2017

machawk1 added this to the 2.0 (Extended more featureful implementation) milestone Mar 2, 2017

machawk1 mentioned this issue Mar 8, 2017

Generator implementation for extractResponseFromChunkedData #129

Open

machawk1 modified the milestones: 2.0 (Extended more featureful implementation), 1.0β Mar 13, 2017

machawk1 mentioned this issue Oct 19, 2018

Perform dechunking at indexing time #592

Closed
