
CompressionStream supported in all browsers #1

Closed · BenjaminAster opened this issue May 16, 2023 · 15 comments
Labels: bug (Something isn't working), question (Further information is requested)

@BenjaminAster

Hi again,
I just saw you forking the CompressionStream polyfill, so I thought you might have missed the news that CompressionStream is now supported by all three major browser engines, including Gecko (Firefox) and WebKit (Safari) (see also https://caniuse.com/mdn-api_compressionstream).

Offroaders123 self-assigned this May 16, 2023
@Offroaders123 (Owner)

Oh yeah, I did see that, super awesome! The only thing is that deflate-raw isn't supported in Node as of now, and browsers from an update or two back might not have it quite yet either. It probably won't be anything to worry about in six months or so though, which will be great.

I do wish there were async functions to go along with the stream-based parts of the API too. Then it would be simpler to use with Uint8Array and ArrayBuffer objects directly, without having to convert them to streams and back every time. It could probably do that under the hood either way, but it seems a bit cumbersome to need to convert via either a Blob constructor or a Response object. Any suggestions on how you would do that? I've just been using my own wrapper async functions for it. It's a minor detail, so I guess it's not too bad to have these in every project that does compression, but it still feels a bit awkward to me.

Sorry for the rant, guess I'm just not sure how I should manage complete feature parity, while also using modern APIs.

After thinking about it, I guess it's no different than having to make your own file picker function wrapper around things like the <input type="file"> element, and the File System Access API when it's present. That also feels like it is in that compatibility gray area, where it has to work either way, but you have to do some work to make it streamlined.
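
For context, a wrapper like that usually ends up looking something like this sketch; the function name is just illustrative, and `showOpenFilePicker()` needs a cast since it isn't in the standard DOM typings yet:

```ts
// Hypothetical file picker wrapper: use the File System Access API when
// available, otherwise fall back to a hidden <input type="file"> element.
async function pickFile(): Promise<File | null> {
  if ("showOpenFilePicker" in window) {
    // Not in lib.dom.d.ts at the time of writing, hence the cast.
    const [handle] = await (window as any).showOpenFilePicker();
    return handle.getFile();
  }
  return new Promise<File | null>(resolve => {
    const input = document.createElement("input");
    input.type = "file";
    // If the user cancels the dialog, "change" never fires and this
    // promise simply stays pending; good enough for a sketch.
    input.addEventListener("change", () => resolve(input.files?.[0] ?? null));
    input.click();
  });
}
```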

Oh yeah, and TypeScript doesn't have support for it yet either. Have you heard any news as to when this API will become part of the standard library type definitions?

Sorry this is so long, I didn't mean for that. I guess I've been unsure about how to properly set this all up, or at least what the 'best' way to do it should be. I'm curious whether there are any performance differences between the two methods I've tried out here:

```ts
// My first version

export type CompressionFormat = "gzip" | "deflate" | "deflate-raw";

export interface CompressionOptions {
  format?: CompressionFormat;
}

/**
 * Transforms a Uint8Array through a specific compression format. If a format is not provided, the gzip format will be used.
*/
export async function compress(data: Uint8Array, { format = "gzip" }: CompressionOptions = {}){
  const stream = new CompressionStream(format);
  const readable = new Blob([data]).stream().pipeThrough(stream);
  const buffer = await new Response(readable).arrayBuffer();
  return new Uint8Array(buffer);
}

/**
 * Transforms a Uint8Array through a specific decompression format. If a format is not provided, the gzip format will be used.
*/
export async function decompress(data: Uint8Array, { format = "gzip" }: CompressionOptions = {}){
  const stream = new DecompressionStream(format);
  const readable = new Blob([data]).stream().pipeThrough(stream);
  const buffer = await new Response(readable).arrayBuffer();
  return new Uint8Array(buffer);
}
```

```ts
// My second version
// I think I changed my mind on 'gzip' being the default compression format.

export type CompressionFormat = "gzip" | "deflate" | "deflate-raw";

export interface CompressionOptions {
  format: CompressionFormat;
}

/**
 * Compresses a Uint8Array using a specific compression format.
*/
export async function compress(data: Uint8Array | ArrayBufferLike, { format }: CompressionOptions): Promise<Uint8Array> {
  const { body } = new Response(data instanceof Uint8Array ? data : new Uint8Array(data));
  const readable = body!.pipeThrough(new CompressionStream(format));
  const buffer = await new Response(readable).arrayBuffer();
  return new Uint8Array(buffer);
}

/**
 * Decompresses a Uint8Array using a specific decompression format.
*/
export async function decompress(data: Uint8Array | ArrayBufferLike, { format }: CompressionOptions): Promise<Uint8Array> {
  const { body } = new Response(data instanceof Uint8Array ? data : new Uint8Array(data));
  const readable = body!.pipeThrough(new DecompressionStream(format));
  const buffer = await new Response(readable).arrayBuffer();
  return new Uint8Array(buffer);
}
```
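
For what it's worth, usage of either version ends up looking about the same; here's a quick round-trip check (an illustrative snippet run in a module with top-level await, not something from the library):

```ts
// Hypothetical round-trip check using the wrappers above (second version shown).
const original = new TextEncoder().encode("Hello, Compression Streams!");

const zipped = await compress(original, { format: "gzip" });
const restored = await decompress(zipped, { format: "gzip" });

console.log(zipped.byteLength, restored.byteLength);
console.log(new TextDecoder().decode(restored)); // "Hello, Compression Streams!"
```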

Offroaders123 added the "question" label May 16, 2023
@Offroaders123 (Owner)

Yeah, this is my main concern with it. While it is supported in the most recent versions, not everyone has moved up to the latest browser versions quite yet.

(screenshot: CompressionStream API support table on caniuse.com)

@BenjaminAster (Author)

There is currently no straightforward way to compress raw binary data without all the stream machinery. I guess that makes sense, since networking is where compression is usually needed, but I would still wish for one myself, especially to be able to read and write .zip, .docx, ... files for a possible future rich-text editor project of mine (see also @ndesmic's article about that). Such an async function has been suggested in e.g. whatwg/compression#8, so it might become available in the future. As for ways to implement your own wrapper function: using Responses and .pipeThrough() seems to be inefficient; the spec includes an example of how to do it in a more performant way using writers and readers.

Regarding the TypeScript definitions, Compression Streams types will be in v5.1 (you can already try it out with its beta version); in the meantime you can use my new-javascript type definitions, where they're also included.

@Offroaders123 (Owner)

Ok, I decided to make a comparison of these two, and it came up with some interesting results!
It would probably also make sense to add an average for all of them; I only log each result individually here (see the sketch after the script below).

You can save this locally as `Compression-Streams-Speed.mts`, then call `npx tsx ./Compression-Streams-Speed.mts` to run it in Node.

```ts
// clear; npx tsx ./Compression-Streams-Speed.mts

const DATA = new Uint8Array(Array.from({ length: 0x1000 },() => Math.floor(Math.random() * 10)));
console.log(DATA,"\n");

const BLOB = new Blob([DATA]);

const TEST_REPEATS = 0x1000;

for (let i = 0; i < TEST_REPEATS; i++){
  const COMPRESSED_DATA =
  await timer(`#${i} ArrayBuffer Compress  `,async () => compressArrayBuffer(DATA,"deflate"));
  await timer(`#${i} ArrayBuffer Decompress`,async () => decompressArrayBuffer(COMPRESSED_DATA,"deflate"));

  const COMPRESSED_BLOB =
  await timer(`#${i} Blob Compress         `,async () => compressBlob(BLOB,"deflate"));
  await timer(`#${i} Blob Decompress       `,async () => decompressBlob(COMPRESSED_BLOB,"deflate"));

  console.log();
}

async function timer<T>(label: string, callback: () => Promise<T>){
  console.time(label);
  const result: T = await callback();
  console.timeEnd(label);
  return result;
}

// 8.2. Deflate-compress an ArrayBuffer to a Uint8Array

async function compressArrayBuffer(input: BufferSource, format: CompressionFormat){
  const cs = new CompressionStream(format);
  const writer = cs.writable.getWriter();
  writer.write(input);
  writer.close();
  const output: Uint8Array[] = [];
  const reader = cs.readable.getReader();
  let totalSize = 0;
  while (true){
    const { done, value } = await reader.read();
    if (done) break;
    output.push(value);
    totalSize += value.byteLength;
  }
  const concatenated = new Uint8Array(totalSize);
  let offset = 0;
  for (const array of output){
    concatenated.set(array,offset);
    offset += array.byteLength;
  }
  return concatenated;
}

// Demo: decompress ArrayBuffer

async function decompressArrayBuffer(input: BufferSource, format: CompressionFormat){
  const ds = new DecompressionStream(format);
  const writer = ds.writable.getWriter();
  writer.write(input);
  writer.close();
  const output: Uint8Array[] = [];
  const reader = ds.readable.getReader();
  let totalSize = 0;
  while (true){
    const { done, value } = await reader.read();
    if (done) break;
    output.push(value);
    totalSize += value.byteLength;
  }
  const concatenated = new Uint8Array(totalSize);
  let offset = 0;
  for (const array of output){
    concatenated.set(array,offset);
    offset += array.byteLength;
  }
  return concatenated;
}

// 8.3. Gzip-decompress a Blob to Blob

async function decompressBlob(blob: Blob, format: CompressionFormat){
  const ds = new DecompressionStream(format);
  const decompressionStream = blob.stream().pipeThrough(ds);
  return new Response(decompressionStream).blob();
}

// Demo: compress Blob

async function compressBlob(blob: Blob, format: CompressionFormat){
  const cs = new CompressionStream(format);
  const compressionStream = blob.stream().pipeThrough(cs);
  return new Response(compressionStream).blob();
}

declare global {
  type CompressionFormat = "deflate" | "deflate-raw" | "gzip";
  class CompressionStream extends TransformStream<BufferSource,Uint8Array> {
    constructor(format: CompressionFormat);
  }
  class DecompressionStream extends TransformStream<BufferSource,Uint8Array> {
    constructor(format: CompressionFormat);
  }
}

export {};
```
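
On the note of averaging: a minimal sketch of one way to do it, swapping `console.time()` for `performance.now()` so the numbers can be accumulated (the `timeAvg` helper and `totals` map are hypothetical additions, not part of the script above):

```ts
// Hypothetical variant of timer() that accumulates totals per label.
const totals = new Map<string, { time: number; runs: number }>();

async function timeAvg<T>(label: string, callback: () => Promise<T>): Promise<T> {
  const start = performance.now();
  const result: T = await callback();
  const elapsed = performance.now() - start;
  const entry = totals.get(label) ?? { time: 0, runs: 0 };
  entry.time += elapsed;
  entry.runs += 1;
  totals.set(label, entry);
  return result;
}

// After the test loop finishes:
for (const [label, { time, runs }] of totals){
  console.log(`${label}: ${(time / runs).toFixed(3)} ms average over ${runs} runs`);
}
```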

@Offroaders123 (Owner)

Thanks for letting me know about the performance differences; I had no idea that was the case! I will look into using the ArrayBuffer loading setup instead of passing everything through a Blob constructor like I'm currently doing.

Offroaders123 added a commit to Offroaders123/NBTify that referenced this issue May 17, 2023
Did more investigating into multiple ways of using the Compression Streams API, and I realized that this is a really nice setup to have, possibly as a standalone module outside of NBTify. This provides a more cross-platform zlib API, similar to using `node:zlib`, but available both in the browser and in Node! It's built on top of the Compression Streams API too, so it's only an API wrapper, and possibly a polyfill where necessary.

I'm going to try setting things up here first, then move them to a dedicated project which both handles the compression polyfills and provides these more function-based APIs. They perform better than my previous stream-specific `Blob` handling, and are simpler to use directly with `ArrayBuffer` and the various `TypedArray` objects, rather than having to go back and forth between `Blob` containers, which appears to be a little less performant in the simple test that I set up.

Oh my, that last sentence was horrible, I'm sorry. I guess I'm more of a dev than I am a writer.

Offroaders123/Dovetail#1
whatwg/compression#8
https://wicg.github.io/compression/#example-deflate-compress
https://jakearchibald.com/2017/async-iterators-and-generators/#making-streams-iterate
Offroaders123 added a commit to Offroaders123/compression-streams-polyfill that referenced this issue May 17, 2023
Brought these ideas over from my NBTify demos.

Not sure what the callback inside of the polyfill was for, but it turns out it was necessary for it to run at all! I added it back in to fix it.

Offroaders123/Dovetail#1
Offroaders123 added a commit to Offroaders123/NBTify that referenced this issue May 19, 2023
Offroaders123 added a commit to Offroaders123/NBTify that referenced this issue Jun 4, 2023
TypeScript added type definitions for the Compression Streams API, yay!!!

As of committing this, VS Code's bundled TypeScript version hasn't updated to support this yet, so you may have to tell the editor to use the workspace's TypeScript version, either from the Command Palette or by placing this in `./.vscode/settings.json`:

```json
{
  "typescript.tsdk": "node_modules/typescript/lib"
}
```

Offroaders123/Dovetail#1
Offroaders123 added a commit to Offroaders123/NBTify that referenced this issue Jun 19, 2023
Fully added support for determining the compression format used by the file, which will now work for all of the formats: `gzip`, `deflate`, and `deflate-raw`!

This might not sound new, because the library has supported explicitly opening all three formats for the last month or so. Now, though, it can automatically open files that are compressed with `deflate-raw`; you don't have to explicitly say whether it is `deflate-raw` or not for it to open successfully.

I've been working on adding support for this for the last four days or so, and I hit multiple roadblocks along the way.

So firstly, I realized that there isn't as easy a way to detect `deflate-raw`-encoded data using the same header technique that I use for the other two formats, because the raw data simply doesn't have a header you can check against. There might be a way to write some fancy logic to detect it, but I realized it might be more explicit and reliable to simply try parsing the data and see what happens. Reading multiple articles online about other people trying to do the same thing, most of the main responses said that just trying to parse the data is the most efficient approach, because it will fail fast once something in the data isn't correct.
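
(For reference, the header technique mentioned above boils down to checking a couple of magic bytes: gzip data starts with `0x1F 0x8B`, and a zlib/`deflate` stream almost always starts with `0x78`, with its first two bytes forming a value divisible by 31, while `deflate-raw` has no such prefix. A rough sketch, with a function name of my own choosing rather than NBTify's actual code:)

```ts
// Hypothetical header sniffing for the two formats that do have headers.
type DetectedFormat = "gzip" | "deflate" | null;

function detectCompressionHeader(data: Uint8Array): DetectedFormat {
  if (data.length < 2) return null;
  // gzip magic number: 0x1F 0x8B
  if (data[0] === 0x1f && data[1] === 0x8b) return "gzip";
  // zlib header: CMF byte 0x78, and (CMF * 256 + FLG) divisible by 31
  if (data[0] === 0x78 && (data[0] * 256 + data[1]) % 31 === 0) return "deflate";
  // deflate-raw has no header, so it can't be detected this way
  return null;
}
```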

After thinking about that for a bit, coming back over the last few days to try it was the next step. Then I started running into errors once everything was set up. They were seemingly random, in that wrapping the new code in try-catch blocks wouldn't catch where any of the errors were coming from.

While debugging it for about a full day or two, my head was starting to spin and I was getting a bit frustrated. I tried breaking the code out into its own file to debug where the error was, and it magically started working. Then I figured it must be something with the Node `deflate-raw` polyfill I added recently, so I added that in to see if it would break the working code (it didn't). I was really happy that it was working, but I was even more concerned that the original code wasn't, given that everything else *was* indeed working. Then I had a last-ditch idea: maybe it was the `pipeThroughCompressionStream()` function? That wouldn't make any sense. But I'd try adding it to my standalone code and see what happens.

```ts
const data = Uint8Array.from({ length: 25 },() => 25);
console.log(data);

const compressed = await compress(data,"deflate-raw");
console.log(compressed);

const unzipped = await unzip(compressed);
console.log(unzipped);

async function unzip(data: Uint8Array): Promise<Uint8Array> {
  try {
    return await decompress(data,"gzip");
  } catch {
    try {
      return await decompress(data,"deflate");
    } catch {
      try {
        return await decompress(data,"deflate-raw");
      } catch {
        throw new Error("Could not unzip the buffer data");
      }
    }
  }
}

async function compress(data: Uint8Array, format: CompressionFormat): Promise<Uint8Array> {
  try {
    const compressionStream = new CompressionStream(format);
    return pipeThroughCompressionStream(data,compressionStream);
  } catch (error){
    if (format !== "deflate-raw") throw error;
    // @ts-expect-error
    const { deflateRawSync } = await import("node:zlib");
    return new Uint8Array(deflateRawSync(data));
  }
}

async function decompress(data: Uint8Array, format: CompressionFormat): Promise<Uint8Array> {
  try {
    const decompressionStream = new DecompressionStream(format);
    return pipeThroughCompressionStream(data,decompressionStream);
  } catch (error){
    if (format !== "deflate-raw") throw error;
    // @ts-expect-error
    const { inflateRawSync } = await import("node:zlib");
    return new Uint8Array(inflateRawSync(data));
  }
}

async function pipeThroughCompressionStream(data: Uint8Array, { readable, writable }: CompressionStream | DecompressionStream): Promise<Uint8Array> {
  const writer = writable.getWriter();

  writer.write(data);
  writer.close();

  const chunks: Uint8Array[] = [];
  let byteLength = 0;

  const generator = (Symbol.asyncIterator in readable) ? readable : readableStreamToAsyncGenerator(readable as ReadableStream<Uint8Array>);

  for await (const chunk of generator){
    chunks.push(chunk);
    byteLength += chunk.byteLength;
  }

  const result = new Uint8Array(byteLength);
  let byteOffset = 0;

  for (const chunk of chunks){
    result.set(chunk,byteOffset);
    byteOffset += chunk.byteLength;
  }

  return result;
}

async function* readableStreamToAsyncGenerator<T>(readable: ReadableStream<T>): AsyncGenerator<T,void,void> {
  const reader = readable.getReader();
  try {
    while (true){
      const { done, value } = await reader.read();
      if (done) return;
      yield value;
    }
  } finally {
    reader.releaseLock();
  }
}
```

Bingo! The same error started happening. So for some reason, at least with Chrome/V8's (and Node's) zlib implementations, you can't decompress a `deflate-raw` stream using my chunked `pipeThroughCompressionStream()` function. That was unexpected! My demo had worked fine just before this, when I had originally set it up with my old `pipeThroughCompressionStream()` implementation, simply because I was too lazy to add all of the fancy logic of the chunked version. Here's that one:

```ts
async function pipeThroughCompressionStream(data: Uint8Array, stream: CompressionStream | DecompressionStream): Promise<Uint8Array> {
  const { body } = new Response(data);
  const readable = body!.pipeThrough(stream);
  const buffer = await new Response(readable).arrayBuffer();
  return new Uint8Array(buffer);
}
```

So yeah, that was an interesting find. And for reference, this is the error code that you get from running this in the console:

```log
Error: incorrect header check
    at __node_internal_genericNodeError (node:internal/errors:865:15)
    at Zlib.zlibOnError [as onerror] (node:zlib:189:17) {
  errno: -3,
  code: 'Z_DATA_ERROR'
}
```

I'd like to delve into this a bit more and figure out why it's an issue. Maybe I'm missing something inside of the `pipeThroughCompressionStream()` function and that's all that needs fixing; it might not be zlib's issue after all (or rather, my use of it).

The chunked version of the function has proven faster in the test we did here (Offroaders123/Dovetail#1), so ideally I'd like to use that version to decompress the data, but the old one works every time, so I'm going with it since it is reliably successful, albeit a little slower (not by much, in the grand scheme of things).

(#11)

I'm also debating, as of writing this, whether it makes sense to provide both ways of doing this auto-detection. Checking the file headers before trying to parse with format `x` is faster and less recursive (you already know the format before trying to open it, so you can just open it with the right one to start with), and the old non-chunked version could be used for `deflate-raw` specifically, since the chunked one works great for the other two formats. I'm deciding not to worry about that with this initial fix though, as having it working for all features is more important to me than having it work the absolute greatest. I don't want to overcomplicate things just because it might be faster; I want to demo things out more to see if I can fix that chunking issue and add `deflate-raw` support to it, so there would only have to be one `pipeThroughCompressionStream()` function, not two. I also like the simplification that this headerless auto-decompression handling provides, so I'm going with that one only for the time being, because it is easier for me to understand at the moment. Eventually, if I can find benefits to having both, I will add both, since the header checks could simplify the reading process and mean less format guessing for the parts that are explicit about how they are formatted (the compression headers).

I may also go this way with the Bedrock Level header, but that mostly depends on if I go all-in on the auto-reading try-catch behavior for compression too, since they are similar in terms of the header-checking behavior.

So here's a mental summary (for you and myself):

I'm going with absolute support first, leaving performance to be picked up later. I want to ensure that everything is working up to par, then worry about possible performance gains after that. Ideally, I eventually want to have both. But in the meantime, I want to rely on features that fully work, without any inconsistencies or things that could cause spaghettic code (typo, but it was neat, so I'm keeping it; spaghettic: to be spaghetti-like; usage: that code is very spaghettic).
@Offroaders123 (Owner)

Offroaders123 commented Jun 19, 2023

I recently discovered that the chunking-function implementation doesn't appear to work with the deflate-raw format. I wrote some demo code to demonstrate this; it's originally from the commit I mentioned just above this comment, 'Auto Deflate-Raw Support'.

I go into a lot more detail about this find in the original commit message. The two most notable parts are:

  • This same code works if you swap the implementation of the pipeThroughCompressionStream() function back to the original new Response()-based implementation

  • This error happens both in Node and in Chrome itself, so it doesn't have to do with the deflate-raw polyfill I set up for Node using node:zlib. They both report a similarly-worded issue, related to the zlib header of the data.

    (screenshots: zlib error in Node, zlib error in Chrome)

(I'm going to re-open this issue to track this a bit more, even though it isn't quite related to the original topic of the issue)

```ts
// clear; npx tsx ./Auto-Unzip.mts

const data = Uint8Array.from({ length: 25 },() => 25);
console.log(data);

const compressed = await compress(data,"deflate-raw");
console.log(compressed);

const unzipped = await unzip(compressed);
console.log(unzipped);

async function unzip(data: Uint8Array): Promise<Uint8Array> {
  try {
    return await decompress(data,"gzip");
  } catch {
    try {
      return await decompress(data,"deflate");
    } catch {
      try {
        return await decompress(data,"deflate-raw");
      } catch {
        throw new Error("Could not unzip the buffer data");
      }
    }
  }
}

async function compress(data: Uint8Array, format: CompressionFormat): Promise<Uint8Array> {
  try {
    const compressionStream = new CompressionStream(format);
    return pipeThroughCompressionStream(data,compressionStream);
  } catch (error){
    if (format !== "deflate-raw") throw error;
    // @ts-expect-error
    const { deflateRawSync } = await import("node:zlib");
    return new Uint8Array(deflateRawSync(data));
  }
}

async function decompress(data: Uint8Array, format: CompressionFormat): Promise<Uint8Array> {
  try {
    const decompressionStream = new DecompressionStream(format);
    return pipeThroughCompressionStream(data,decompressionStream);
  } catch (error){
    if (format !== "deflate-raw") throw error;
    // @ts-expect-error
    const { inflateRawSync } = await import("node:zlib");
    return new Uint8Array(inflateRawSync(data));
  }
}

// // This original implementation does not cause the error
// async function pipeThroughCompressionStream(data: Uint8Array, stream: CompressionStream | DecompressionStream): Promise<Uint8Array> {
//   const { body } = new Response(data);
//   const readable = body!.pipeThrough(stream);
//   const buffer = await new Response(readable).arrayBuffer();
//   return new Uint8Array(buffer);
// }

async function pipeThroughCompressionStream(data: Uint8Array, { readable, writable }: CompressionStream | DecompressionStream): Promise<Uint8Array> {
  const writer = writable.getWriter();

  writer.write(data);
  writer.close();

  const chunks: Uint8Array[] = [];
  let byteLength = 0;

  const generator = (Symbol.asyncIterator in readable) ? readable : readableStreamToAsyncGenerator(readable as ReadableStream<Uint8Array>);

  for await (const chunk of generator){
    chunks.push(chunk);
    byteLength += chunk.byteLength;
  }

  const result = new Uint8Array(byteLength);
  let byteOffset = 0;

  for (const chunk of chunks){
    result.set(chunk,byteOffset);
    byteOffset += chunk.byteLength;
  }

  return result;
}

async function* readableStreamToAsyncGenerator<T>(readable: ReadableStream<T>): AsyncGenerator<T,void,void> {
  const reader = readable.getReader();
  try {
    while (true){
      const { done, value } = await reader.read();
      if (done) return;
      yield value;
    }
  } finally {
    reader.releaseLock();
  }
}

declare global {
  interface CompressionStream {
    readonly readable: ReadableStream<Uint8Array>;
    readonly writable: WritableStream<BufferSource>;
  }

  interface DecompressionStream {
    readonly readable: ReadableStream<Uint8Array>;
    readonly writable: WritableStream<BufferSource>;
  }

  interface ReadableStream<R> {
    [Symbol.asyncIterator](): AsyncGenerator<R>;
  }
}
```

Offroaders123 reopened this Jun 19, 2023
@Offroaders123 (Owner)

Welp, I think I may have found it! This was a sneaky one for sure.

Turns out calling write() and close() on a WritableStreamDefaultWriter actually each return Promises, so those have to be awaited! It fixed the issue right away.

(screenshot: WritableStreamDefaultWriter method return types)

Offroaders123 added a commit to Offroaders123/NBTify that referenced this issue Jun 19, 2023
Figured out the issue! Yay, epic.

Turns out that the `WritableStreamDefaultWriter` methods `write()` and `close()` return `Promise`s, so you have to `await` them to make sure that errors get bubbled back up to the containing `async` function, which is why the demons were let loose and why I couldn't catch the error with a `try`/`catch` handler. I'm not quite sure yet why it was working for all the other cases up until now; maybe the `for await` loop helped make it work or something? Eh, it's still kind of confusing.

Offroaders123/Dovetail#1

https://developer.mozilla.org/en-US/docs/Web/API/WritableStreamDefaultWriter/write
https://developer.mozilla.org/en-US/docs/Web/API/WritableStreamDefaultWriter/close

This commit only brings back the original chunking-function code and adds the missing `await` calls to `writer.write()` and `writer.close()` so it works properly now.
@Offroaders123 (Owner)

Hmm, maybe it's not fixed so soon. Having both of the await calls present caused the decompression to stop working in the browser, while it had fixed it in Node. I just discovered this while trying to implement NBTify 1.60.0 into Dovetail, where it caused all compressed files to fail silently during opening. I think it's because the Promise for one of these calls never resolves in the browser for some reason, while it does in Node.

Offroaders123 reopened this Jun 22, 2023
@Offroaders123 (Owner)

I'm curious if this has to do with a possible difference in the underlying stream implementations between Node and the browser.

This caught my eye over on the WritableStreamDefaultWriter docs.

Note that what "success" means is up to the underlying sink; it might indicate that the chunk has been accepted, and not necessarily that it is safely saved to its ultimate destination.

Does this also mean that it's possible that it won't always resolve either?

@Offroaders123 (Owner)

Ok, this is getting more interesting. It's looking like those calls might not have been the issue all along. It's starting to look like it's instead the changes I made before I brought back their original implementations in the 'Compression Headers Revert' commit for NBTify.

So adding await to the WritableStreamDefaultWriter method calls introduced this new blocking-Promise bug, and the original 'incorrect header check' issue might actually have been caused by my then-'updated' try-catch handling for the various compression formats, rather than by the pre-read header-check implementations (which is what going back will fix).

This seems to be what my tests are showing: if I only update the Read module (that's where the auto-detect compression logic is), the error comes back, and the blocking behavior isn't present on either platform if I remove the await calls from the current release.

This message is starting to feel less clear the more I write it; I'm just going to keep testing, then come back lol.

@Offroaders123 (Owner)

Ok, looks like this is the offending code:

```ts
export async function read(data,{ name, endian, compression, bedrockLevel, strict } = {}){

  // Snippet of the function body; incomplete

  if (compression === undefined){
    try {
      return await read(data,{ name, endian, compression: null, bedrockLevel, strict });
    } catch (error){
      try {
        return await read(data,{ name, endian, compression: "gzip", bedrockLevel, strict });
      } catch {
        try {
          return await read(data,{ name, endian, compression: "deflate", bedrockLevel, strict });
        } catch {
          try {
            return await read(data,{ name, endian, compression: "deflate-raw", bedrockLevel, strict });
          } catch {
            throw error;
          }
        }
      }
    }
  }
}
```
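
As an aside (not necessarily how NBTify ended up handling it), the nested try/catch pyramid could be flattened by looping over the candidate formats inside that same `compression === undefined` branch; a sketch of the same incomplete snippet, restructured:

```ts
export async function read(data,{ name, endian, compression, bedrockLevel, strict } = {}){

  // Snippet of the function body; incomplete

  if (compression === undefined){
    const attempts = [null, "gzip", "deflate", "deflate-raw"] as const;
    let firstError: unknown;
    for (const attempt of attempts){
      try {
        return await read(data,{ name, endian, compression: attempt, bedrockLevel, strict });
      } catch (error){
        // Keep the first error (the uncompressed attempt), matching the original behavior.
        firstError ??= error;
      }
    }
    throw firstError;
  }
}
```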

@Offroaders123 (Owner)

Yep, that's it! Yay! Glad it's actually NBTify, and not the chunking code, that makes a lot more sense haha.

Here are the tests for comparison; both are with the await method calls removed:

(screenshots: test using offending code, test without offending code)

Notably, leaving the offending code in and running it in addition to the await-calls 'fix' lets all the tests decompress correctly in Node, while it causes the unresponsive-Promises issue in the browser (hence, I think that's the order in which it slipped past me, too).

Offroaders123 added the "bug" label Jun 22, 2023
@Offroaders123 (Owner)

It would've been ideal to track this in NBTify instead I guess, but that's ok. I thought it had to do with the chunking code originally.

Offroaders123 added a commit to Offroaders123/NBTify that referenced this issue Jun 22, 2023
Wow, this was an interesting one! Turns out my current compression code doesn't work in the browser; I discovered this while trying to update to the latest version in Dovetail.

Offroaders123/Dovetail#1

Realized that I made the Bedrock Level header version for `ridiculous.nbt` a signed int, oops.

I really want to find out whether the header is signed or not. My implementation currently accepts both signed *and* unsigned values, so it isn't very structurally sound, come to think of it.
Offroaders123 added a commit to Offroaders123/NBTify that referenced this issue Jun 30, 2023
Wow, this was an interesting one! I think it may have to do with how the backing implementation of the Compression Streams API differs between the browser(s) and Node. Turns out that, after all, the error is still partially around the `writer.write()` and `writer.close()` calls.

So, in Node, awaiting both calls works because they have a `.then()`-able return value, while in Chrome, Safari, and Firefox they never settle when chained with `.then()` or used with `await`, hence why it would hang with what looked like an errorless result. In Node, however, these do eventually resolve, with just a `void` return value.

And interestingly enough, I'm not sure whether it has to do with the backing compression implementation, or with how Promise rejections are handled when the task ends (I don't remember what that's called, but it hides any errors that haven't rejected by the time the task completes; I think it's that second one, and I'll explain), but those errors don't occur in Node while they do in the browser.

I think this is because the process/console closes before the Promises have rejected, meaning the failures in those function calls simply get hidden.

I think this is related to Jake Archibald's article about `for await`, and how it can't properly catch unhandled Promises, as they aren't caught by the current task (not sure if that's the proper term either):

https://jakearchibald.com/2023/unhandled-rejections/

While searching for possible fixes, seeing the extra `.catch()` handlers made it click a little more for me, and it helped me further discover where these ghost errors were bubbling up from:

https://stackoverflow.com/questions/37624322/uncaught-in-promise-undefined-error-when-using-with-location-in-facebook-gra

Overall, I think the main issue is that `writer.write()` and `writer.close()` don't have properly `.then()`-able resolutions in the browser(s), while they do in Node. If I could use `await` on these calls, I could ensure that these errors get bubbled back up to the main function, which could pass them on to the caller, which can then decide how to handle them.

In the meantime, I think I simply have to rely on the errors that fire after these calls, since they can't be successfully caught by the main containing function. So I think adding a ghost `.catch()` handler to them will be the only fix for it, unfortunately.

Maybe I'm missing something about how you should use `write()` and `close()`, and maybe they shouldn't be awaited. I'd think awaiting would be key to catching errors properly in situations like these, though, no?

Offroaders123/Dovetail#1

For a while I thought this had to do with my other changes to the Read module (root ListTags, `return await` try-catch calls, `deflate-raw` auto-detect handling), but it turns out it was just the error handling within the Compression module itself: when piping through a compression stream had an error, it simply wasn't being propagated to the parent scope correctly, because the call wasn't being awaited (because that doesn't work in the browser).

Ooh, and the part I said I'd come back to! I think this was actually happening in Node too, but only when the thread ran long enough for the errors to bubble up and become uncaught. When I implemented the auto-detect-without-header-checks change, which uses try-catch for each of the compression format types, errors would magically cause the *next* compression format to error out, and I couldn't figure out why. Turns out it *was* because Node was exiting, in every other case, before those un-awaited Promises erred out (the header-based checks were faster than the header-less checks), so the header-less checks ran long enough for the uncaught errors from the previous auto-detect compression attempt to bubble up to the main scope as uncaught. I associated this with the `deflate-raw` format because it would only happen with that one: the other checks would finish fast enough to finalize before the errors could bubble up in time.
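
For the record, my reading of that "ghost `.catch()`" workaround in the chunker looks roughly like this (a sketch, not the exact NBTify code; the key point is that compression errors still reject the reads on the readable side, so they bubble up to the caller's try/catch):

```ts
async function pipeThroughCompressionStream(data: Uint8Array, { readable, writable }: CompressionStream | DecompressionStream): Promise<Uint8Array> {
  const writer = writable.getWriter();

  // Don't await these here (they may never settle in some browsers); attach
  // no-op rejection handlers so a failure can't surface as an unhandled rejection.
  writer.write(data).catch(() => {});
  writer.close().catch(() => {});

  // Any compression/decompression error rejects reader.read() below,
  // which does bubble up to the containing async function.
  const reader = readable.getReader();
  const chunks: Uint8Array[] = [];
  let byteLength = 0;
  while (true){
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    byteLength += value.byteLength;
  }

  const result = new Uint8Array(byteLength);
  let byteOffset = 0;
  for (const chunk of chunks){
    result.set(chunk,byteOffset);
    byteOffset += chunk.byteLength;
  }
  return result;
}
```
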
@Offroaders123 (Owner)

Ok, turns out it does have to do with the writer.write() and writer.close() calls. See this commit message for more intel on this.

@Offroaders123 (Owner)

For a better reference on NBTify's error handling here, see the demo code in this commit, referenced from the same commit as above.

(screenshots: Promise results in Node, Chrome, Safari, and Firefox)

Offroaders123 added a commit to Offroaders123/NBTify that referenced this issue Jun 30, 2023
I think this could be the final resolution to the unhandled Promise streaming issue.

This will fix the issues in implementing the latest version of NBTify into Dovetail.

whatwg/compression#55
Offroaders123/Dovetail#1

https://jakearchibald.com/2023/unhandled-rejections/ (Part of the resolution)
Offroaders123 added a commit that referenced this issue Jul 3, 2023
Bringing Dovetail up to the latest version of NBTify! That means there is now support for deflate-raw compressed files! This update was waiting on fixes for some additional compression errors in NBTify that went unnoticed in the tests I wrote for Node, because Node's different streams implementation behaved differently and passed the tests unexpectedly, meaning the extra unhandled errors were only happening in the browser.

#1