I manually downloaded the model and set it up with the command "python setup_env.py -md .\models\Llama3-8B-1.58-100B-tokens -q i2_s" on Windows 11. The result shows:
"ERROR:root:Error occurred while running command: Command '['./build/bin/Release/llama-quantize', '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-f32.gguf', '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-i2_s.gguf', 'I2_S', '1']' returned non-zero exit status 1., check details in logs\quantize_to_i2s.log"
After checking the log, it says:
"main: build = 22 (bf11a49)
main: built with Clang 18.1.8 for Win32
main: quantizing '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-f32.gguf' to '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-i2_s.gguf' as I2_S using 1 threads
llama_model_quantize: failed to quantize: tensor 'output.weight' data is not within the file bounds, model is corrupted or incomplete
main: failed to quantize model from '.\models\Llama3-8B-1.58-100B-tokens\ggml-model-f32.gguf'"
Am I downloading the wrong version of Llama3-8B-1.58-100B-tokens, or did I do something else wrong?
GGML_ASSERT((qs.n_attention_wv == n_attn_layer) && "n_attention_wv is unexpected") failed
--
Update: The model was not downloaded correctly. I re-ran "python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q i2_s" and everything went well.
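For anyone who hits the same "data is not within the file bounds" error: it usually means the downloaded .gguf file is incomplete or not actually a GGUF file. Before running the quantizer you can sanity-check the file by verifying it starts with the GGUF magic bytes (every valid GGUF file begins with the ASCII bytes "GGUF"). A minimal sketch, with an illustrative path:

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # every valid GGUF file starts with these 4 bytes

def looks_like_gguf(path):
    """Return True if the file exists, is non-trivially sized,
    and starts with the GGUF magic bytes."""
    p = Path(path)
    if not p.is_file() or p.stat().st_size < 8:
        return False
    with p.open("rb") as f:
        return f.read(4) == GGUF_MAGIC

# Example usage (path is illustrative):
# looks_like_gguf(r".\models\Llama3-8B-1.58-100B-tokens\ggml-model-f32.gguf")
```

Note that a magic-bytes check only catches a wrongly-saved or non-GGUF file; a download truncated partway through can still start with the right bytes. Comparing the on-disk file size against the size shown on the Hugging Face repo page is the more reliable check for truncation.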