Thank you so much for creating and maintaining Kavita. I feel I might be the only one who needs the support of KFX.
KFX is a proprietary format used by the Amazon Kindle ecosystem to deliver ebooks, including graphic novels and comics. It is notable because, owing to Amazon's history with Comixology, it carries some very high-resolution comics that cannot legally be obtained elsewhere. For that reason, I buy primarily from Amazon and keep the files myself.
As far as I know, there is no KFX comic reader other than Kindle itself. I implemented a rudimentary reader myself, although it is far from offering the features that Kavita does.
KFX is an Amazon Ion package. Ion has official implementations in many languages, including C# (https://github.com/amazon-ion/ion-dotnet), and there is a third-party Python implementation of a KFX reader (https://github.com/kluyg/calibre-kfx-input). I wrote a Go implementation myself, and I give a rough outline of the structure (to the degree it is relevant to comic books) at the end of this feature request.
KFX is ultimately a container that supports at least the following formats:
BMP, GIF, JPG, JXR, PBM, PDF, PNG, PObject, TIFF, BPG, WebP
That said, almost all the comics I own are PNG or JPG files. Now that the transcoding feature is available, supporting the formats listed here seems more feasible.
Again thank you!
Regards,
R
The flow for decoding the KFX structure is as follows:
I. KFX Container Processing:
Read Container Header:
  Read signature (4 bytes) - Verify it's "CONT"
  Read version (2 bytes) - Check if it's 1 or 2
  Read header length (4 bytes)
  Read container info offset (4 bytes)
  Read container info length (4 bytes)
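As a rough illustration of the header step, here is a minimal Python sketch. The field widths come from the list above; the little-endian byte order, the `ContainerHeader` type, and the `parse_container_header` name are my own assumptions for illustration and should be checked against real files.

```python
import struct
from typing import NamedTuple


class ContainerHeader(NamedTuple):
    """Fixed-size fields at the start of a KFX container (hypothetical type)."""
    version: int
    header_len: int
    info_offset: int
    info_len: int


# ASSUMPTION: integers are little-endian. The outline gives widths only.
# Layout: 4-byte signature, 2-byte version, three 4-byte unsigned integers.
_HEADER = struct.Struct("<4sHIII")


def parse_container_header(data: bytes) -> ContainerHeader:
    """Validate the signature and version, then return the header fields."""
    if len(data) < _HEADER.size:
        raise ValueError("data is too short to hold a container header")
    signature, version, header_len, info_offset, info_len = _HEADER.unpack_from(data, 0)
    if signature != b"CONT":
        raise ValueError(f"unexpected container signature: {signature!r}")
    if version not in (1, 2):
        raise ValueError(f"unsupported container version: {version}")
    return ContainerHeader(version, header_len, info_offset, info_len)
```

The returned offsets can then be used to slice out the container info bytes before handing them to an Ion decoder.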
Read Container Info (Ion struct):
  Deserialize the Ion data at the container info offset into an IonStruct.
  Extract:
    container_id
    compression_type (default 0)
    drm_scheme (default 0)
    doc_symbol_offset (optional)
    doc_symbol_length (optional)
    chunk_size (default 4096)
    format_capabilities_offset (optional, version > 1)
    format_capabilities_length (optional, version > 1)
    index_table_offset
    index_table_length
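The defaults above could be applied along these lines. This sketch assumes the Ion struct has already been decoded into a plain mapping keyed by the field names listed above (how those names map to Ion symbols is not covered here). `ContainerInfo` and `container_info_from_struct` are hypothetical names, and treating missing optional offsets and lengths as 0 is my own choice.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class ContainerInfo:
    """Container info fields with the defaults from the list above."""
    container_id: Any
    index_table_offset: int
    index_table_length: int
    compression_type: int = 0
    drm_scheme: int = 0
    chunk_size: int = 4096
    doc_symbol_offset: int = 0
    doc_symbol_length: int = 0
    format_capabilities_offset: int = 0
    format_capabilities_length: int = 0


def container_info_from_struct(fields: Mapping[str, Any], version: int) -> ContainerInfo:
    """Build ContainerInfo from a decoded struct, filling in defaults."""
    # Format capabilities are only read for container versions above 1.
    caps = version > 1
    return ContainerInfo(
        # No default is listed for these three, so they are required here.
        container_id=fields["container_id"],
        index_table_offset=fields["index_table_offset"],
        index_table_length=fields["index_table_length"],
        compression_type=fields.get("compression_type", 0),
        drm_scheme=fields.get("drm_scheme", 0),
        chunk_size=fields.get("chunk_size", 4096),
        doc_symbol_offset=fields.get("doc_symbol_offset", 0),
        doc_symbol_length=fields.get("doc_symbol_length", 0),
        format_capabilities_offset=fields.get("format_capabilities_offset", 0) if caps else 0,
        format_capabilities_length=fields.get("format_capabilities_length", 0) if caps else 0,
    )
```

With zero as the "absent" value, the later checks such as `doc_symbol_length > 0` work without special-casing missing fields.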
Read Document Symbols (Ion annotated value): (Note: this step matters little for comics, as the vast majority of comic books have an empty internal symbol table)
  If doc_symbol_length > 0:
    Deserialize the Ion data at doc_symbol_offset as an IonAnnotation.
    Verify the annotation is $ion_symbol_table.
    Adjust the max_id values of any imports in the symbol table by adding the number of system symbol table entries.
    Create a new local symbol table based on this data.
Read Format Capabilities (Ion annotated value):
  If format_capabilities_length > 0 (and version > 1):
    Deserialize the Ion data at format_capabilities_offset as an IonAnnotation.
    Verify the annotation is $593.
Read KFXGen Info (JSON): (Note: also mostly irrelevant to comic books, which are primarily just images)
  Extract the JSON string between the container info and the end of the header.
  Deserialize the JSON to get kfxgen_package_version, kfxgen_application_version, kfxgen_payload_sha1, and kfxgen_acr.
  Verify kfxgen_payload_sha1.
Read Index Table: (Note: this is where the embedded entities are located within the container)
  Deserialize the data at index_table_offset into a list of entity entries:
    For each entry:
      Read id_idnum (4 bytes)
      Read type_idnum (4 bytes)
      Read entity_offset (8 bytes) - This is relative to the end of the header.
      Read entity_len (8 bytes)
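The index table walk might look roughly like the sketch below. The 24-byte entry layout follows the widths above. The little-endian byte order, the reading of index_table_offset as an absolute offset into the file, and the names `IndexEntry`, `parse_index_table`, and `entity_bytes` are assumptions of mine.

```python
import struct
from typing import List, NamedTuple


class IndexEntry(NamedTuple):
    """One index table row (hypothetical type)."""
    id_idnum: int
    type_idnum: int
    entity_offset: int  # relative to the end of the container header
    entity_len: int


# ASSUMPTION: little-endian. Two 4-byte and two 8-byte unsigned integers.
_ENTRY = struct.Struct("<IIQQ")


def parse_index_table(data: bytes, table_offset: int, table_length: int) -> List[IndexEntry]:
    """Read every fixed-size entry in the index table."""
    if table_length % _ENTRY.size != 0:
        raise ValueError("index table length is not a multiple of the entry size")
    if table_offset + table_length > len(data):
        raise ValueError("index table extends past the end of the data")
    return [
        IndexEntry(*_ENTRY.unpack_from(data, table_offset + i))
        for i in range(0, table_length, _ENTRY.size)
    ]


def entity_bytes(data: bytes, header_len: int, entry: IndexEntry) -> bytes:
    """Slice out one entity; its offset is counted from the end of the header."""
    start = header_len + entry.entity_offset
    end = start + entry.entity_len
    if end > len(data):
        raise ValueError("entity extends past the end of the data")
    return data[start:end]
```

Each slice returned by `entity_bytes` is then a self-contained entity that starts with its own "ENTY" header.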
Determine Container Format:
  Based on the type_idnums found in the index table entries, and on whether there were any document symbols, determine whether the container is KFX_MAIN, KFX_METADATA, or KFX_ATTACHABLE.
Create Container Info Fragment:
  Create an annotation fragment ($270): an Ion struct containing all the metadata extracted earlier, the version number, and the list of entities in [[type_idnum, id_idnum], ...] format.
Deserialize Entities:
  For each entity entry in the index table:
    Create a KfxContainerEntity object.
    Call deserialize() on the entity.
II. Entity Processing
Read Entity Header:
  Read signature (4 bytes) - Verify it's "ENTY"
  Read version (2 bytes) - Check if it's 1
  Read header length (4 bytes)
Read Entity Info (Ion struct):
  Deserialize the Ion data at the beginning of the entity (up to header length) into an IonStruct.
  Extract:
    compression_type (default 0)
    drm_scheme (default 0)
Read Entity Data:
  Extract the remaining bytes after the header as entity_data.
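The three entity steps above can be sketched as a single split. This assumes little-endian integers and that the entity info Ion data sits directly after the 10 bytes of fixed fields and runs up to the header length, which is how I read the outline. `split_entity` is a hypothetical helper name.

```python
import struct
from typing import Tuple

# ASSUMPTION: little-endian. 4-byte signature, 2-byte version, 4-byte header length.
_ENTITY_HEADER = struct.Struct("<4sHI")


def split_entity(entity: bytes) -> Tuple[int, bytes, bytes]:
    """Return (version, entity_info_bytes, entity_data) for one entity."""
    if len(entity) < _ENTITY_HEADER.size:
        raise ValueError("data is too short to hold an entity header")
    signature, version, header_len = _ENTITY_HEADER.unpack_from(entity, 0)
    if signature != b"ENTY":
        raise ValueError(f"unexpected entity signature: {signature!r}")
    if version != 1:
        raise ValueError(f"unsupported entity version: {version}")
    if not _ENTITY_HEADER.size <= header_len <= len(entity):
        raise ValueError("entity header length is out of range")
    # Ion-encoded entity info occupies the rest of the header.
    info_bytes = entity[_ENTITY_HEADER.size:header_len]
    # Everything after the header is the payload (for comics, often an image).
    entity_data = entity[header_len:]
    return version, info_bytes, entity_data
```

The `info_bytes` would still need an Ion decoder to yield compression_type and drm_scheme; for raw fragment types, `entity_data` can be used as-is.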
Deserialize Entity Data (based on type):
  Get fid (field ID) and ftype (fragment type) from the symbol table using id_idnum and type_idnum.
  If ftype is in RAW_FRAGMENT_TYPES:
    Treat entity_data as an IonBLOB.
  Otherwise:
    Deserialize entity_data using IonBinary.deserialize_single_value().
    If the deserialized value is an IonAnnotation:
      If the annotation is ftype and fid is $348, replace the value with the annotation's inner value and update fid to ftype.
III. Ion Binary Deserialization (Note: this is the general routine for deserializing a single Ion value)
Read Descriptor:
  Read the first byte as the descriptor.
  Extract signature (top 4 bits) and flag (bottom 4 bits).
Determine Length:
  If flag is VARIABLE_LEN_FLAG (14):
    Read a variable-length unsigned integer (deserialize_vluint) to get the length.
  Otherwise:
    Length is equal to flag.
Deserialize based on Signature:
  Use VALUE_DESERIALIZERS to find the appropriate deserialization function based on signature.
  Call the function, passing the flag (or length) and the Deserializer object.
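For reference, the descriptor and length steps can be sketched as follows. The variable-length integer here follows the Ion binary VarUInt encoding (7 data bits per byte, most significant group first, high bit set on the final byte). `read_vluint` and `read_descriptor` are hypothetical names, and dispatching on the signature is left to the caller.

```python
from typing import Tuple

VARIABLE_LEN_FLAG = 14


def read_vluint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode an Ion VarUInt starting at pos; return (value, next_pos)."""
    value = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated variable-length integer")
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:  # the high bit marks the final byte
            return value, pos


def read_descriptor(data: bytes, pos: int = 0) -> Tuple[int, int, int, int]:
    """Return (signature, flag, length, next_pos) for the value at pos."""
    if pos >= len(data):
        raise ValueError("no descriptor byte available")
    descriptor = data[pos]
    pos += 1
    signature = descriptor >> 4   # top 4 bits: value type
    flag = descriptor & 0x0F      # bottom 4 bits: length or special marker
    if flag == VARIABLE_LEN_FLAG:
        length, pos = read_vluint(data, pos)
    else:
        length = flag
    return signature, flag, length, pos
```

The raw flag is returned alongside the length because some Ion types give the low nibble a special meaning (for example, 15 denotes a null value, and booleans store their value in it), so a per-type deserializer may need it.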
Idea Category
Feature Enhancement
Duration of Using Kavita
No response
Before submitting
I've already searched for existing ideas before posting.