encountered an AssertionError while using UniDepthV2 to predict depth #45

Open
red-liu opened this issue May 22, 2024 · 7 comments

red-liu commented May 22, 2024

I really appreciate your great work. However, when I used UniDepthV2 for prediction, I encountered an AssertionError, shown below:

  File "/home/user/app/app.py", line 23, in <module>
    predictions = model.infer(rgb)
  File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/app/unidepth/models/unidepthv2/unidepthv2.py", line 229, in infer
    features, tokens = self.pixel_encoder(rgbs)
  File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/app/unidepth/models/backbones/dinov2.py", line 324, in forward
    x = self.prepare_tokens_with_masks(x, masks)
  File "/home/user/app/unidepth/models/backbones/dinov2.py", line 312, in prepare_tokens_with_masks
    x = x + self.interpolate_pos_encoding(x, w, h)
  File "/home/user/app/unidepth/models/backbones/dinov2.py", line 297, in interpolate_pos_encoding
    and int(h0) == patch_pos_embed.shape[-1]

The values I get are: int(w0) = 57 and patch_pos_embed.shape[-2] = 57, but int(h0) = 43 while patch_pos_embed.shape[-1] = 42.

@red-liu
Author

red-liu commented May 22, 2024

Another question: if I resize a picture to a lower or higher resolution, how will the result change?

@lpiccinelli-eth
Owner

Thanks for using our work!

What is your input shape? (Or the config you are passing to the model, such as pixels_bounds, etc.)

To answer your question: the results may change a bit, but we expect them to be quite consistent, which is not typical of previous works, especially for metric estimation.

@red-liu
Author

red-liu commented May 22, 2024

Thank you very much for your reply.
My input is a picture of shape (4032, 3024), taken with an iPhone 13, so the height is larger than the width.
I did not set pixels_bounds to anything specific because I don't understand its purpose.

@BaderTim

Hi there, I am encountering a similar error with KITTI-shaped images.

@lpiccinelli-eth
Owner

Thank you for the info. It looks like it fails when the ratio is out of bounds. I will check it and get back to you with (hopefully) the corrected version.

@lpiccinelli-eth
Owner

The error comes from the original DINO code and was solved in this PR. We have committed the changes, so it should now be fixed.

Let me know if something is still off.

@red-liu
Author

red-liu commented May 24, 2024

It seems the issue has been resolved. Thank you very much for your help. By the way, I've encountered a new problem: there's a significant difference between the intrinsics predictions from version 1 and version 2. Do you have any idea what might be causing this discrepancy?
For the same picture, the intrinsics predictions are as follows:

v2:
    [[[3.8631e+03, 0.0000e+00, 1.5201e+03],
     [0.0000e+00, 4.0359e+03, 2.0109e+03],
     [0.0000e+00, 0.0000e+00, 1.0000e+00]]]

v1:
    [[[1.6742e+03, 0.0000e+00, 1.5174e+03],
     [0.0000e+00, 2.7255e+03, 2.0222e+03],
     [0.0000e+00, 0.0000e+00, 1.0000e+00]]]
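
One way to quantify the gap between the two matrices is to convert the focal lengths into fields of view and to compare fy/fx for each version. The snippet below is only a rough sketch: it assumes an ideal pinhole camera and an image 3024 pixels wide by 4032 pixels tall (as reported earlier in the thread), and the helper names `fov_degrees` and `summarize` are made up for illustration and are not part of UniDepth.

```python
import math

# Image size reported in the thread (portrait photo): assumed width x height.
WIDTH, HEIGHT = 3024, 4032

# Focal lengths and principal points copied from the two matrices above (pixels).
INTRINSICS = {
    "v2": {"fx": 3863.1, "fy": 4035.9, "cx": 1520.1, "cy": 2010.9},
    "v1": {"fx": 1674.2, "fy": 2725.5, "cx": 1517.4, "cy": 2022.2},
}


def fov_degrees(size_px, focal_px):
    """Full field of view (degrees) along one axis of an ideal pinhole camera."""
    return math.degrees(2.0 * math.atan(size_px / (2.0 * focal_px)))


def summarize(k, width=WIDTH, height=HEIGHT):
    """Hypothetical helper: derive comparable quantities from fx and fy."""
    return {
        "hfov_deg": fov_degrees(width, k["fx"]),
        "vfov_deg": fov_degrees(height, k["fy"]),
        # For square pixels this ratio is expected to be close to 1.
        "fy_over_fx": k["fy"] / k["fx"],
    }


if __name__ == "__main__":
    for name, k in INTRINSICS.items():
        s = summarize(k)
        print(
            f"{name}: hfov={s['hfov_deg']:.1f} deg, "
            f"vfov={s['vfov_deg']:.1f} deg, fy/fx={s['fy_over_fx']:.3f}"
        )
```

With these numbers, the v2 focal lengths are nearly equal in x and y, while the v1 prediction has fy much larger than fx. The principal points agree closely between the two versions, so the discrepancy is concentrated in the focal lengths.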
