Can't quantize

#16
by vekitan - opened

I was able to build llama_quantize.exe by applying the patch to the specified branch, but when I try to quantize the Qwen-Image model it fails with:

llama_model_quantize: failed to quantize: unknown model architecture: 'qwenimage'

How were the models shown here created?

  1. Change branch to auto_convert. Download the full branch and unzip it somewhere.
  2. Open the folder /tools/ and run:
    git clone https://github.com/ggerganov/llama.cpp llama.cpp.auto
  3. Manually apply the patch:
    cd llama.cpp.auto
    git checkout tags/b3962
    git apply ..\lcpp.patch
  4. Run:
    mkdir build
    cmake -B build -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_CXX_FLAGS="-std=c++17"
  5. Then edit the llama.cpp.auto\common\log.cpp file, inserting these two lines after the existing first line (#include "log.h"):

#define _SILENCE_CXX23_CHRONO_DEPRECATION_WARNING
#include

  6. In the folder llama.cpp.auto, run:
    cmake --build build --config Debug -j10 --target llama-quantize
    cd ..
  7. Run the conversion process:
    python tool_auto.py --src [path to source .safetensors] --quants [Q5_K_M or other desired quants]
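The command-line steps above can be sketched as a small Python driver. This is only an illustration, not part of the original workflow: the repository URL, tag, patch name, CMake flags, and tool_auto.py invocation are taken from the steps above, while the function names and the dry_run flag are hypothetical. Step 5 (the manual edit of common/log.cpp) is not automated here and must still be done by hand before the build step runs.

```python
# Hypothetical driver for steps 2-4, 6, and 7 above (step 5 is manual).
# Run from the /tools/ folder so that ..\lcpp.patch resolves correctly.
import subprocess


def build_commands(src_safetensors: str, quant: str = "Q5_K_M") -> list[list[str]]:
    """Return the commands corresponding to the numbered steps above."""
    return [
        # Step 2: clone llama.cpp into llama.cpp.auto
        ["git", "clone", "https://github.com/ggerganov/llama.cpp", "llama.cpp.auto"],
        # Step 3: check out the pinned tag and apply the patch
        # (git -C changes into the directory first, like the cd in the post)
        ["git", "-C", "llama.cpp.auto", "checkout", "tags/b3962"],
        ["git", "-C", "llama.cpp.auto", "apply", "../lcpp.patch"],
        # Step 4: configure (equivalent to running cmake -B build inside the repo)
        ["cmake", "-S", "llama.cpp.auto", "-B", "llama.cpp.auto/build",
         "-DCMAKE_CXX_STANDARD=17", "-DCMAKE_CXX_STANDARD_REQUIRED=ON",
         "-DCMAKE_CXX_FLAGS=-std=c++17"],
        # Step 6: build only the llama-quantize target
        ["cmake", "--build", "llama.cpp.auto/build", "--config", "Debug",
         "-j10", "--target", "llama-quantize"],
        # Step 7: run the conversion tool
        ["python", "tool_auto.py", "--src", src_safetensors, "--quants", quant],
    ]


def run(src_safetensors: str, quant: str = "Q5_K_M", dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them in order, stopping on failure."""
    for cmd in build_commands(src_safetensors, quant):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run("model.safetensors")  # dry run: just prints the command sequence
```

Keeping the command list separate from execution makes it easy to inspect the exact sequence (or adapt paths for Windows) before anything touches the filesystem.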

It just took me 8 hours to figure this out.
