I manually downloaded the model and set it up with the command "python setup_env.py -md .\models\Llama3-8B-1.58-100B-tokens -q i2_s" on Windows 11. The result shows:
"ERROR:root:Error occurred while running command: Command '['./build/bin/Release/llama-quantize', '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-f32.gguf', '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-i2_s.gguf', 'I2_S', '1']' returned non-zero exit status 1., check details in logs\quantize_to_i2s.log"
After checking the log, it says:
"main: build = 22 (bf11a49)
main: built with Clang 18.1.8 for Win32
main: quantizing '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-f32.gguf' to '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-i2_s.gguf' as I2_S using 1 threads
llama_model_quantize: failed to quantize: tensor 'output.weight' data is not within the file bounds, model is corrupted or incomplete
main: failed to quantize model from '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-f32.gguf'"
Am I downloading the wrong version of Llama3-8B-1.58-100B-tokens, or did I do something else wrong?
GGML_ASSERT((qs.n_attention_wv == n_attn_layer) && "n_attention_wv is unexpected") failed
--
Update: The model was not downloaded correctly. I re-ran "python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q i2_s" and everything went well.
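For anyone who hits the same "data is not within the file bounds" error: it usually means the downloaded .gguf file is incomplete or not actually a GGUF file. Before running the quantizer you can sanity-check the file by verifying it starts with the GGUF magic bytes (every valid GGUF file begins with the ASCII bytes "GGUF"). A minimal sketch, with an illustrative path:

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # every valid GGUF file starts with these 4 bytes

def looks_like_gguf(path):
    """Return True if the file exists, is non-trivially sized,
    and starts with the GGUF magic bytes."""
    p = Path(path)
    if not p.is_file() or p.stat().st_size < 8:
        return False
    with p.open("rb") as f:
        return f.read(4) == GGUF_MAGIC

# Example usage (path is illustrative):
# looks_like_gguf(r".\models\Llama3-8B-1.58-100B-tokens\ggml-model-f32.gguf")
```

Note that a magic-bytes check only catches a wrongly-saved or non-GGUF file; a download truncated partway through can still start with the right bytes. Comparing the on-disk file size against the size shown on the Hugging Face repo page is the more reliable check for truncation.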