Can't quantize #16
opened by vekitan
I was able to build llama_quantize.exe by applying the patch to the specified branch, but quantizing the Qwen-Image model fails with:

llama_model_quantize: failed to quantize: unknown model architecture: 'qwenimage'

How were the models shown here created?
1. Change branch to auto_convert. Download the full branch and unzip it somewhere.
2. Open the folder /tools/ and do:
   git clone https://github.com/ggerganov/llama.cpp llama.cpp.auto
3. Manually apply the patch:
   cd llama.cpp.auto
   git checkout tags/b3962
   git apply ..\lcpp.patch
4. Do:
   mkdir build
   cmake -B build -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_CXX_FLAGS="-std=c++17"
5. Edit the file llama.cpp.auto\common\log.cpp and insert two lines after the existing first line:
   #include "log.h"
   #define _SILENCE_CXX23_CHRONO_DEPRECATION_WARNING
   #include <chrono>
6. In the folder llama.cpp.auto:
   cmake --build build --config Debug -j10 --target llama-quantize
   cd ..
7. Run the conversion process:
   python tool_auto.py --src [path to source .safetensors] --quants [desired quants, e.g. Q5_K_M]
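The steps above can be sketched as a small Python driver. This is a hypothetical helper, not part of the repo: it only assembles the commands listed above so you can review them (or dry-run the sequence) before executing anything. The paths, the b3962 tag, and the tool_auto.py invocation are taken verbatim from the steps; everything else is an assumption.

```python
import subprocess

def build_quantize_commands(src, quants="Q5_K_M"):
    """Assemble the build-and-convert steps from the instructions above.

    Returns a list of (working_dir, command) pairs; nothing is executed here.
    `src` is the path to the source .safetensors file.
    """
    return [
        (".", ["git", "clone", "https://github.com/ggerganov/llama.cpp", "llama.cpp.auto"]),
        ("llama.cpp.auto", ["git", "checkout", "tags/b3962"]),
        ("llama.cpp.auto", ["git", "apply", "../lcpp.patch"]),
        ("llama.cpp.auto", ["cmake", "-B", "build",
                            "-DCMAKE_CXX_STANDARD=17",
                            "-DCMAKE_CXX_STANDARD_REQUIRED=ON",
                            "-DCMAKE_CXX_FLAGS=-std=c++17"]),
        # NOTE: the manual edit of common/log.cpp (step 5) still has to be
        # done by hand between the configure and build steps.
        ("llama.cpp.auto", ["cmake", "--build", "build", "--config", "Debug",
                            "-j10", "--target", "llama-quantize"]),
        (".", ["python", "tool_auto.py", "--src", src, "--quants", quants]),
    ]

def run_all(src, quants="Q5_K_M", dry_run=True):
    """Print each step; execute it only when dry_run=False."""
    for cwd, cmd in build_quantize_commands(src, quants):
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            subprocess.run(cmd, cwd=cwd, check=True)

if __name__ == "__main__":
    run_all("model.safetensors")  # dry run: just prints the command sequence
```

Keeping the command list separate from execution makes it easy to skip steps that are already done (e.g. re-running only the build and conversion after the manual log.cpp edit).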
Just lost 8 hours figuring this out.