LLM Compressor testing - a nm-testing Collection

Updated Nov 17, 2025

nm-testing/tinysmokellama-3.2

354k • Updated Sep 17, 2025 • 55.8k
nm-testing/llama2.c-stories42M-pruned2.4

Updated Oct 29, 2025 • 850
nm-testing/tinyllama-fp8-dynamic-compressed

1B • Updated Oct 9, 2024 • 408
nm-testing/tinyllama-w4a16-compressed

0.3B • Updated Oct 9, 2024 • 679
nm-testing/tinyllama-w8a8-compressed

1B • Updated Oct 9, 2024 • 752
nm-testing/tinyllama-w8a16-dense

1B • Updated 25 days ago • 321
nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic-compressed

1B • Updated Jan 14, 2025 • 499
nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic-uncompressed

1B • Updated Jan 14, 2025 • 207
nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16-G128-compressed

0.3B • Updated Jan 14, 2025 • 120
nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16-G128-uncompressed

1B • Updated Jan 14, 2025 • 108
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8-Dynamic-Per-Token-compressed

1B • Updated Jan 14, 2025 • 118
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8-Dynamic-Per-Token-uncompressed

1B • Updated Jan 14, 2025 • 98
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A16-G128-compressed

0.4B • Updated Jan 14, 2025 • 512
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A16-G128-uncompressed

1B • Updated Jan 14, 2025 • 203