[feature request] Whisper with openblas #52

Open
bil-ash opened this issue Jan 22, 2024 · 3 comments

bil-ash commented Jan 22, 2024

First of all, thanks for this cool project. Since you have started adding support for models other than Stable Diffusion, please also add support for Whisper with W8A8 quantization.
Also, XNNPACK seems to be aimed at speeding up float operations. Does that mean XNNPACK is not required for W8A8 inference?
Finally, consider adding OpenBLAS as a drop-in replacement for cuBLAS, so that GPU acceleration can also be used on Intel and AMD CPUs with integrated graphics.
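To make the W8A8 request concrete, here is a minimal sketch of a single int8 × int8 matmul with symmetric per-tensor scales. The names and types are purely illustrative and are not part of OnnxStream's API:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Symmetric per-tensor int8 quantization: real_value ≈ q * scale.
// In a W8A8 scheme, weights are quantized like this offline and
// activations are quantized the same way at runtime.
struct QuantizedTensor {
    std::vector<int8_t> q;
    float scale;
};

QuantizedTensor quantize(const std::vector<float>& x) {
    float max_abs = 0.0f;
    for (float v : x) max_abs = std::max(max_abs, std::fabs(v));
    const float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
    QuantizedTensor t{std::vector<int8_t>(x.size()), scale};
    for (std::size_t i = 0; i < x.size(); ++i) {
        long r = std::lround(x[i] / scale);
        t.q[i] = static_cast<int8_t>(std::clamp(r, -127L, 127L));
    }
    return t;
}

// int8 x int8 matmul with int32 accumulation; only the output is float.
// A is (m x k), B is (k x n), the result is (m x n).
std::vector<float> w8a8_matmul(const QuantizedTensor& A, const QuantizedTensor& B,
                               int m, int k, int n) {
    std::vector<float> C(static_cast<std::size_t>(m) * n);
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j) {
            int32_t acc = 0;
            for (int p = 0; p < k; ++p)
                acc += int32_t(A.q[i * k + p]) * int32_t(B.q[p * n + j]);
            C[i * n + j] = acc * (A.scale * B.scale);  // dequantize once per output
        }
    return C;
}
```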

vitoplantamura commented Jan 27, 2024 via email

bil-ash commented Jan 28, 2024

Okay, got the point regarding XNNPACK.

I guess you should implement hipBLAS first, since it would require minimal changes. Just please allow overriding the GPU target via AMDGPU_TARGETS, like llama.cpp does. I have a machine with an AMD GPU that is not officially supported by hipBLAS, and overriding the target lets me run llama.cpp (with better performance than on the CPU), but I can't run anything that doesn't support overriding the GPU, for example the OnnxStream LLM demo with GPU. That is why I was asking for OpenBLAS support.

insanely-fast-whisper is aimed at servers with GPUs; you could aim for the CPU instead. Also, every Whisper quantization implementation for CPU inference that I have seen so far does only weight quantization. You could quantize both weights and activations, reducing disk and memory usage.
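To illustrate the memory argument, here is a back-of-the-envelope comparison for one layer. The shapes below are made up for illustration and are not taken from Whisper's actual configuration:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Rough memory for one (k x n) weight matrix and one (m x k) activation
// batch under three schemes: no quantization, weight-only int8, and W8A8.
int main() {
    const double MiB = 1024.0 * 1024.0;
    const std::size_t m = 1500, k = 1024, n = 4096;  // hypothetical layer shapes

    double w_fp32 = k * n * sizeof(float)  / MiB;
    double w_int8 = k * n * sizeof(int8_t) / MiB;
    double a_fp32 = m * k * sizeof(float)  / MiB;
    double a_int8 = m * k * sizeof(int8_t) / MiB;

    std::printf("fp32 baseline:            weights %.1f MiB, activations %.1f MiB\n", w_fp32, a_fp32);
    std::printf("weight-only int8:         weights %.1f MiB, activations %.1f MiB\n", w_int8, a_fp32);
    std::printf("weight+activation (W8A8): weights %.1f MiB, activations %.1f MiB\n", w_int8, a_int8);
    return 0;
}
```

Weight-only quantization shrinks the model on disk, but the activations (and the matmul itself) stay in float; quantizing activations as well shrinks the intermediate buffers too.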

vitoplantamura commented Jan 28, 2024 via email
