Why is the INT8 quantized model disabled on GPU? I ran the INT8 optimization on CPU and then performed inference with the INT8 model on GPU. This works on an Arc A310 dGPU and an ARL-H iGPU. Could such a flow be added to the notebook? Thanks.
Hi @ekurniaw, which OpenVINO version do you use? We limited INT8 support for GPU in the notebook because, at the time of publication, using the INT8 image encoder on GPU led to inaccurate generation results (the model itself ran, but the response contained nothing related to the image). I got confirmation that this was recently fixed, so yes, we can update the notebook to allow running the INT8 image encoder on GPU, along with updating OpenVINO in the notebook requirements.
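For the updated notebook flow, a minimal sketch of the device selection could look like the following. The `select_device` helper and the model path used in the comment are hypothetical, not part of the notebook; they only illustrate compiling the INT8 encoder for GPU when available, with a CPU fallback:

```python
def select_device(available_devices, preferred="GPU"):
    """Pick the preferred device if the OpenVINO runtime reports it
    (including enumerated names like 'GPU.0'), else fall back to CPU."""
    if any(d.startswith(preferred) for d in available_devices):
        return preferred
    return "CPU"

# A runtime reporting both devices prefers the GPU.
print(select_device(["CPU", "GPU"]))  # GPU
# A CPU-only runtime falls back cleanly.
print(select_device(["CPU"]))         # CPU
```

With OpenVINO installed, this would plug into the usual API, e.g. `core = ov.Core()` followed by `core.compile_model(int8_encoder_path, select_device(core.available_devices))`, where `int8_encoder_path` is the IR produced by the compression script.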
Referenced code: openvino_notebooks/notebooks/mllama-3.2/ov_mllama_compression.py, line 56 (commit f9caad1)