Illegal instruction (core dumped) #15
Could you please share the configs you are using for this model?
Btw I just built different images for different BLAS backends:
Could you please let me know if that helps?
I tried those images, and all of them still resulted in the illegal instruction. Thanks for the extra images to test with. If I can find some cycles, I will clone the repo, investigate, and push a fix. I believe the issue is that OpenBLAS is not compiled with runtime CPU detection, so it assumes the build machine's instruction set is available everywhere. With a modern CPU this isn't an issue, but I'm using an old CPU (my fault, it's what I have) with no AVX flag, which I think is the culprit.
$ cat /proc/cpuinfo
For the record: config
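A quick way to confirm which SIMD extensions the CPU actually advertises (a minimal sketch building on the /proc/cpuinfo check above):

```sh
# Print the SIMD-related flags the kernel reports for this CPU.
# If "avx" is missing, any binary compiled with AVX instructions
# will die with SIGILL (illegal instruction).
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -E '^(sse|avx|fma|f16c)' | sort -u
```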
What worked for me

If I start up a container using the image
At the shell prompt in the docker container:
Adjust this in the config and use a different model, since the latest llama-cpp-python no longer loads GGML models, only GGUF.
Then it starts without enabling BLAS - it's slow, but it works.
That might give you some clues.
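For anyone else hitting this, a sketch of the workaround above; the image name and mount path are placeholders, not taken from this thread:

```sh
# Start the container with a shell instead of the server entrypoint
# (substitute the image you actually pulled).
docker run -it --rm \
  -v "$PWD/models:/models" \
  --entrypoint /bin/bash \
  <llm-api-image>

# Inside the container: edit the config to point at a GGUF model,
# then launch the app directly.
python app/main.py
```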
I presume there is a minimum CPU requirement, like needing AVX2, AVX-512, F16C, or something?
Could you document the minimum instruction set and extensions required?
root@1d1c4289f303:/llm-api# python app/main.py
2023-10-26 23:31:19,237 - INFO - llama - found an existing model /models/llama_601507219781/ggml-model-q4_0.bin
2023-10-26 23:31:19,237 - INFO - llama - setup done successfully for /models/llama_601507219781/ggml-model-q4_0.bin
Illegal instruction (core dumped)
root@1d1c4289f303:/llm-api#
--- modulename: llama, funcname: __init__
llama.py(289): self.verbose = verbose
llama.py(291): self.numa = numa
llama.py(292): if not Llama.__backend_initialized:
llama.py(293): if self.verbose:
llama.py(294): llama_cpp.llama_backend_init(self.numa)
--- modulename: llama_cpp, funcname: llama_backend_init
llama_cpp.py(475): return _lib.llama_backend_init(numa)
Illegal instruction (core dumped)
I assume this has CPU requirements.
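For reference, a trace like the one above can be produced with Python's built-in trace module (a sketch; it assumes the app entry point is app/main.py, as in the earlier log):

```sh
# Print every line as it executes; the last line shown before
# "Illegal instruction" identifies the faulting call.
python -m trace --trace app/main.py 2>&1 | tail -n 20
```

Here the crash lands inside _lib.llama_backend_init, i.e. in the native llama.cpp shared library, which is consistent with an unsupported CPU instruction rather than a Python-level bug.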
ENV CMAKE_ARGS "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
OpenBLAS can be built for multiple targets with runtime detection of the target CPU by specifying DYNAMIC_ARCH=1 in Makefile.rule, on the gmake command line, or as -DDYNAMIC_ARCH=TRUE in CMake.
https://github.com/OpenMathLib/OpenBLAS/blob/develop/README.md
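A sketch of what that could look like in the image build (the OpenBLAS steps follow its README; rebuilding llama-cpp-python from source with pip is an assumption about how the image is put together):

```sh
# Build OpenBLAS with runtime CPU dispatch (DYNAMIC_ARCH) instead of
# baking in the build host's instruction set.
git clone https://github.com/OpenMathLib/OpenBLAS.git
cd OpenBLAS
make DYNAMIC_ARCH=1
make PREFIX=/usr/local install

# Force a source rebuild of llama-cpp-python so the CMake flags apply.
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" \
  pip install --force-reinstall --no-cache-dir llama-cpp-python
```

Note that llama.cpp itself is also compiled with AVX/AVX2/FMA/F16C enabled by default, so on a CPU without AVX you would likely also need -DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF in CMAKE_ARGS.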