Trying to quantize granite-3B
#1
by Interpause - opened
The 3B and 8B variants are just Llama models, architecturally. Only 20B and up are GPTBigCode. ExLlama should automatically detect it based on the architecture string in config.json.
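As a quick sanity check before quantizing, you can inspect the `architectures` field in the checkpoint's `config.json` yourself. This is a minimal sketch; the helper name and the sample path are illustrative, not part of ExLlama's API:

```python
import json

def get_architectures(config_path):
    """Return the architecture strings declared in a model's config.json."""
    with open(config_path) as f:
        config = json.load(f)
    # Llama-style checkpoints (granite-3b/8b) declare "LlamaForCausalLM";
    # the 20B+ Granite models declare "GPTBigCodeForCausalLM" instead.
    return config.get("architectures", [])
```

If this returns `["LlamaForCausalLM"]`, the model should load through the normal Llama path.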