Can't quantize

#16
by vekitan - opened

I was able to build llama_quantize.exe by applying the patch to the specified branch, but when I try to quantize the Qwen-Image model it fails with:

llama_model_quantize: failed to quantize: unknown model architecture: 'qwenimage'

How were the models shown here created?

  1. Change branch to auto_convert. Download the full branch and unzip it somewhere.
  2. Open the folder /tools/ and run:
    git clone https://github.com/ggerganov/llama.cpp llama.cpp.auto
  3. Manually apply the patch:
    cd llama.cpp.auto
    git checkout tags/b3962
    git apply ..\lcpp.patch
  4. Run:
    mkdir build
    cmake -B build -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_CXX_FLAGS="-std=c++17"
  5. Then edit the llama.cpp.auto\common\log.cpp file, inserting these two lines after the existing first line (#include "log.h"):

#define _SILENCE_CXX23_CHRONO_DEPRECATION_WARNING
#include

  6. In the folder llama.cpp.auto, run:
    cmake --build build --config Debug -j10 --target llama-quantize
    cd ..
  7. Run the conversion process:
    python tool_auto.py --src [path to source .safetensors] --quants [Q5_K_M or other desired quants]
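The command-line steps above can be sketched as a small Python driver. This is only an illustration, not part of the original workflow: the repository URL, tag, patch name, CMake flags, and tool_auto.py invocation are taken from the steps above, while the function names and the dry_run flag are hypothetical. Step 5 (the manual edit of common/log.cpp) is not automated here and must still be done by hand before the build step runs.

```python
# Hypothetical driver for steps 2-4, 6, and 7 above (step 5 is manual).
# Run from the /tools/ folder so that ..\lcpp.patch resolves correctly.
import subprocess


def build_commands(src_safetensors: str, quant: str = "Q5_K_M") -> list[list[str]]:
    """Return the commands corresponding to the numbered steps above."""
    return [
        # Step 2: clone llama.cpp into llama.cpp.auto
        ["git", "clone", "https://github.com/ggerganov/llama.cpp", "llama.cpp.auto"],
        # Step 3: check out the pinned tag and apply the patch
        # (git -C changes into the directory first, like the cd in the post)
        ["git", "-C", "llama.cpp.auto", "checkout", "tags/b3962"],
        ["git", "-C", "llama.cpp.auto", "apply", "../lcpp.patch"],
        # Step 4: configure (equivalent to running cmake -B build inside the repo)
        ["cmake", "-S", "llama.cpp.auto", "-B", "llama.cpp.auto/build",
         "-DCMAKE_CXX_STANDARD=17", "-DCMAKE_CXX_STANDARD_REQUIRED=ON",
         "-DCMAKE_CXX_FLAGS=-std=c++17"],
        # Step 6: build only the llama-quantize target
        ["cmake", "--build", "llama.cpp.auto/build", "--config", "Debug",
         "-j10", "--target", "llama-quantize"],
        # Step 7: run the conversion tool
        ["python", "tool_auto.py", "--src", src_safetensors, "--quants", quant],
    ]


def run(src_safetensors: str, quant: str = "Q5_K_M", dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them in order, stopping on failure."""
    for cmd in build_commands(src_safetensors, quant):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run("model.safetensors")  # dry run: just prints the command sequence
```

Keeping the command list separate from execution makes it easy to inspect the exact sequence (or adapt paths for Windows) before anything touches the filesystem.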

It just took me 8 hours to figure this out.
