Generator implementation for extractResponseFromChunkedData #129

Open
ibnesayeed opened this issue Mar 6, 2017 · 2 comments

Comments

@ibnesayeed
Member

I think we have a use case for implementing a generator function that extracts chunks from a chunked-encoded payload. This would enable transparent iteration over the chunks. Maybe we should revisit this when we move dechunking to indexing time.
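For illustration, a generator along these lines might look like the sketch below. It is not the existing `extractResponseFromChunkedData` code; the name `iter_chunks` and the choice of a file-like byte stream as input are assumptions made for this example, and trailer headers after the final chunk are simply left unread.

```python
import io


def iter_chunks(stream):
    """Yield each chunk body from a chunked-encoded byte stream.

    Hypothetical helper for illustration; `stream` is assumed to be a
    binary file-like object positioned at the first chunk-size line.
    """
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the zero-size chunk")
        # Ignore any chunk extensions after ';' and parse the hex size.
        size_field = size_line.split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            return  # last chunk reached; trailers, if any, stay unread
        chunk = stream.read(size)
        if len(chunk) != size:
            raise ValueError("truncated chunk")
        stream.readline()  # consume the CRLF that follows the chunk data
        yield chunk


# Sample payload with three chunks followed by the terminating chunk.
payload = b"4\r\nWiki\r\n5\r\npedia\r\nE\r\n in\r\n\r\nchunks.\r\n0\r\n\r\n"
body = b"".join(iter_chunks(io.BytesIO(payload)))
```

Because the chunks are produced lazily, a caller can either join them as above or process them one at a time without holding the whole payload in memory.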

@machawk1
Member

machawk1 commented Mar 8, 2017

@ibnesayeed You are correct in that a generator probably would have been a better way to accomplish this. The current implementation (still in the indexer per #126) is fairly succinct.

https://github.com/oduwsdl/ipwb/blob/master/ipwb/replay.py#L268-L284

When we do move to a generator, we ought to benchmark the current implementation against the new one. Further, to reinforce your previous points, the dechunking code ought to live in a script independent of both the replay and indexing code and be reused as needed. One potential use case is still-chunked responses that already exist in IPFS (perhaps added through some means other than IPWB).

@ibnesayeed
Member Author

We can perhaps utilize https://github.com/webrecorder/warcio to get the dechunked payload directly as a buffered stream.
