
When I run 0001-Use-KleidiAI-Int4-Matmul-micro-kernels-in-llama.cpp.patch, I get an 'Illegal instruction' error #155

Open
qw1319 opened this issue Dec 12, 2024 · 1 comment


qw1319 commented Dec 12, 2024

When I follow the KleidiAI example step by step (https://github.com/ARM-software/ML-examples/tree/main/kleidiai-examples/llama_cpp), I get this error:
```
KleidiAI: CPU features:
-- neon: yes
-- dotprod: yes
-- i8mm: yes
-- sme: no
llm_load_tensors: CPU model buffer size = 3647.87 MiB
..................................................................................................
llama_new_context_with_model: n_seq_max = 1
llama_new_context_with_model: n_ctx = 2048
llama_new_context_with_model: n_ctx_per_seq = 2048
llama_new_context_with_model: n_batch = 2048
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: n_ctx_per_seq (2048) < n_ctx_train (4096) -- the full capacity of the model will not be utilized
llama_kv_cache_init: CPU KV buffer size = 1024.00 MiB
llama_new_context_with_model: KV self size = 1024.00 MiB, K (f16): 512.00 MiB, V (f16): 512.00 MiB
llama_new_context_with_model: CPU output buffer size = 0.12 MiB
llama_new_context_with_model: CPU compute buffer size = 164.01 MiB
llama_new_context_with_model: graph nodes = 1030
llama_new_context_with_model: graph splits = 1
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
Illegal instruction
```
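The log above shows KleidiAI detecting NEON, dotprod, and i8mm (but not SME) before the crash, so a SIGILL during warm-up suggests either the detection is wrong for this CPU or a kernel emits an instruction outside the detected set. As a quick sanity check, one can compare the features the kernel reports in `/proc/cpuinfo` against what the int4 matmul micro-kernels presumably need. This is a minimal sketch (the required-feature set is my assumption based on the log, not taken from the KleidiAI sources; feature names are the standard Linux aarch64 hwcap strings):

```python
# Compare the Arm features in /proc/cpuinfo against the features the
# KleidiAI int4 matmul path appears to use per the startup log.
# ASSUMPTION: the kernels need NEON (asimd), dotprod (asimddp), and i8mm.
REQUIRED = {"asimd", "asimddp", "i8mm"}

def parse_features(cpuinfo_text):
    """Return the set of feature flags from a /proc/cpuinfo dump."""
    feats = set()
    for line in cpuinfo_text.splitlines():
        # aarch64 /proc/cpuinfo lists flags on "Features : ..." lines
        if line.lower().startswith("features"):
            feats.update(line.split(":", 1)[1].split())
    return feats

def missing_features(cpuinfo_text, required=REQUIRED):
    """Return required flags the CPU does not advertise, sorted."""
    return sorted(required - parse_features(cpuinfo_text))

if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        gaps = missing_features(f.read())
    print("missing:", gaps if gaps else "none")
```

If this reports no missing features, the fault more likely comes from an instruction the patch emits unconditionally (e.g. an SME op on a non-SME core) rather than from feature detection, and disassembling the faulting address under gdb would be the next step.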

@AndreasKunar

Same here. I followed the instructions for applying the patch on top of llama.cpp commit b8deef0ec0af5febac1d2cfd9119ff330ed0b762 and built natively, and I get "Illegal instruction (core dumped)" in llama-cli, ...

Hardware: Surface Laptop 7 15", Snapdragon X Elite, 16GB RAM
Software: Windows 11 24H2 (Build 26100.2605), WSL2/Ubuntu 24.04 LTS; patched llama.cpp version: 4034 (b8deef0e), built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for aarch64-linux-gnu

On the same hardware with WSL2/Linux/gcc, the current version of llama.cpp (version: 4361 (7585edbd)) runs perfectly fine.
