In #125 @ibnesayeed mentioned that a page is replayed more often than archived. The same content at different URI-Rs should yield the same IPFS hash when pushed. If chunking differs between servers, this will not be the case if the chunk lengths are considered part of the payload and used as part of the basis for the hash.
The logic to decode chunked responses has been implemented in replay.py. Move and adapt this implementation for chunked payloads prior to pushing to IPFS. Use the dechunked payload as the basis for the IPFS hash when writing the CDXJ.
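A minimal sketch of the dechunking idea, written independently of the actual code in replay.py (the function name `dechunk` is hypothetical): strip the HTTP/1.1 chunked transfer-encoding framing so the hash is computed over the raw content rather than over server-specific chunk boundaries.

```python
import hashlib


def dechunk(payload: bytes) -> bytes:
    """Strip HTTP/1.1 chunked transfer-encoding framing from a payload."""
    body = b''
    pos = 0
    while True:
        # Chunk header: hex size (optionally followed by extensions), then CRLF
        crlf = payload.index(b'\r\n', pos)
        size = int(payload[pos:crlf].split(b';')[0], 16)
        if size == 0:  # terminating zero-length chunk
            break
        start = crlf + 2
        body += payload[start:start + size]
        pos = start + size + 2  # skip chunk data and its trailing CRLF
    return body


# Two servers chunking the same content differently yield the same
# dechunked bytes, and therefore the same digest:
a = b'5\r\nHello\r\n7\r\n, World\r\n0\r\n\r\n'
b = b'c\r\nHello, World\r\n0\r\n\r\n'
assert dechunk(a) == dechunk(b) == b'Hello, World'
assert hashlib.sha256(dechunk(a)).digest() == hashlib.sha256(dechunk(b)).digest()
```

This illustrates why hashing the dechunked payload makes the CDXJ entry independent of how any particular server chose its chunk sizes.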
machawk1 changed the title from "Decode Chunked Transfer encoded payload prior to pushing to IPFS instead of decoding at recplay" to "Decode Chunked Transfer encoded payload prior to pushing to IPFS instead of decoding at replay" on Mar 2, 2017.
@ibnesayeed I referred to #125 so your rationale will persist. Part of the first paragraph is what I had in mind when we first discussed implementing this in replay.