"Not all quantized models perform well." Serving frameworks: Ollama uses the NVIDIA GPU; llama.cpp uses the CPU with AVX & AMX.
v1k
xbruce22
AI & ML interests: None yet
Recent Activity
liked a model about 8 hours ago: zai-org/GLM-4.7-FP8
liked a model about 9 hours ago: zai-org/GLM-4.7-Flash
liked a model 7 days ago: zai-org/GLM-Image