Correct usage of model? #4
Update: I have figured out how to run the model without errors (the input images need to have the same number of channels as bits per pixel, and the images need to be preprocessed first). I also tried some of the sample eval arguments on the provided dataset, but received the following error in the structural_similarity function:
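The channel-matching fix mentioned above can be sketched as follows. This is a hypothetical helper, not part of the LISO repo: it splits a grayscale payload into one binary plane per encoded bit, so the payload's channel count matches the model's bits-per-pixel setting.

```python
# Hypothetical helper (not from the LISO repo): split a grayscale payload
# into one binary plane per encoded bit, so the payload tensor's channel
# count matches the model's bits-per-pixel setting.
def payload_to_bitplanes(pixels, bits_per_pixel):
    """pixels: 2-D list of ints in [0, 2**bits_per_pixel); returns
    bits_per_pixel binary planes, most significant bit first."""
    return [
        [[(p >> b) & 1 for p in row] for row in pixels]
        for b in range(bits_per_pixel - 1, -1, -1)
    ]

# A 1x2 payload with values 0b101 and 0b010 yields three planes:
planes = payload_to_bitplanes([[0b101, 0b010]], bits_per_pixel=3)
```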
Hi Thomas, thank you for your interest in our work. You can refer to this Colab notebook for running model inference. Please feel free to reach out if you have any further questions.
Hi, is the model's size and architecture capable of achieving higher accuracy in matching the cover image? Apologies if these are silly questions; I'm still somewhat new to model training in general 😅
Hi Thomas, Thank you for your feedback. I appreciate your efforts in experimenting with the model and making adjustments. Firstly, it's important to note that image steganography with JPEG compression presents additional challenges compared to PNG encoding. JPEG's lossy compression, which removes high-frequency components to reduce file size, inherently makes it more difficult to preserve the hidden message without visible distortions. For improved image quality, I would recommend using LISO-PNG (the default setting) models.
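The effect described above can be seen in a toy sketch (not LISO code, and the quantization step value is purely illustrative): JPEG-style quantization rounds each transform coefficient to a multiple of its quantization step, so a small perturbation used to hide a bit can vanish in the round trip.

```python
# Toy illustration (not LISO code): JPEG quantization rounds each DCT
# coefficient to a multiple of its quantization step, so small
# perturbations carrying hidden bits can be erased by the round trip.
def quantize(coeff, step):
    return round(coeff / step) * step

original = 21.0    # a high-frequency DCT coefficient of the cover image
perturbed = 21.4   # +0.4 encodes part of a hidden message
step = 10          # an illustrative step size for a high-frequency band

# Both values collapse to the same quantized coefficient, erasing the bit.
assert quantize(original, step) == quantize(perturbed, step) == 20
```

This is why JPEG-robust steganography has to learn perturbations that survive quantization, rather than relying on exact pixel values as PNG encoding can.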
Evaluated on the DIV2K validation set with 1 bit encoded in each pixel. Regarding the results of your training: in our experiments we do not observe a substantial performance gain from increased dataset size or number of training epochs. However, the trade-off between image quality and decoding accuracy can be controlled with the … parameter. Please feel free to reach out if you require further clarification.
Ah, I see. My main interest was in the JPEG mode, as that is in my opinion the main thing setting it apart from purely deterministic steganography methods like low-bitplane substitution. I was mostly curious whether the model would be capable of learning to distribute data more imperceptibly by taking advantage of existing image content, such as higher-noise areas, edges, and colour transitions. But after looking through the existing code, I suppose that also complicates the training process, because how an image is perceived by humans is not quite the same as what MSE, PSNR, or other metrics represent. As usual, I appreciate the quick replies!
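On the metrics point above: PSNR is a fixed function of pixelwise MSE, so it weights every pixel equally with no notion of edges, texture, or perceptual masking. A minimal sketch for 8-bit images:

```python
import math

# PSNR for 8-bit images is a fixed function of pixelwise MSE; it weights
# every pixel equally, regardless of where in the image the error sits.
def psnr(mse, max_val=255.0):
    return float("inf") if mse == 0 else 10 * math.log10(max_val ** 2 / mse)

# An MSE of 650.25 gives exactly 20 dB, whether the distortion is hidden
# in a noisy texture or smeared across a flat sky.
score = psnr(650.25)
```

This is why perceptually aware training objectives (e.g. SSIM-style structural terms) are often added alongside MSE when imperceptibility matters.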
First of all, let me say that this is a really cool project!
I wanted to test inference on single files (to see if it can be integrated into a couple of projects of mine). While I was able to get an output that resembled both the cover and data inputs, I think I'm doing something wrong, as the output is very colour-distorted. Sometimes, depending on the content of the image (I made sure to keep the image size consistent), it gives the following error or a variant of it in the encoder.forward -> conv2d step:
RuntimeError: Given groups=1, weight of size [32, 33, 3, 3], expected input[1, 35, 512, 512] to have 33 channels, but got 35 channels instead
Here's the code I used to run inference with the model:
Let me know if I should be doing something different here. Thanks!
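The channel counts in the error above are consistent with a payload-depth mismatch. A sketch of the arithmetic, under the assumption (this is a reading of the error, not the repo's actual code) that the encoder concatenates its 32 hidden feature maps with the payload planes, as SteganoGAN-style encoders do:

```python
# Assumption: encoder input = 32 hidden feature maps + payload planes.
hidden_channels = 32
bits_per_pixel = 1                  # payload depth the 1-bit model expects
expected_in = hidden_channels + bits_per_pixel    # 33, matching weight [32, 33, 3, 3]

payload_channels = 3                # a 3-channel RGB payload was passed instead
actual_in = hidden_channels + payload_channels    # 35, the reported input

assert (expected_in, actual_in) == (33, 35)
```

Under this reading, passing a 1-channel payload to the 1-bit model would make the channel counts line up.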