Microcoder 1.5B

Microcoder 1.5B is a code-focused language model fine-tuned from Qwen 2.5 Coder 1.5B Instruct using LoRA (Low-Rank Adaptation) on curated code datasets. It is designed for code generation, completion, and instruction-following tasks in a lightweight, efficient package.


Model Details

| Property | Value |
|---|---|
| Base Model | Qwen 2.5 Coder 1.5B Instruct |
| Fine-tuning | LoRA |
| Parameters | ~1.5B |
| License | BSD 3-Clause |
| Language | English (primary), multilingual code |
| Task | Code generation, completion, instruction following |

Benchmarks

| Benchmark | Metric | Score |
|---|---|---|
| HumanEval | pass@1 | 59.15% |
| MBPP+ | pass@1 | 52.91% |

HumanEval and MBPP+ results were obtained using the model in GGUF format with Q5_K_M quantization. Results may vary slightly with other formats or quantization levels.
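For reference, the pass@k metric reported above is conventionally computed with the unbiased estimator introduced for HumanEval: given n generated samples per problem of which c pass the tests, it estimates the probability that at least one of k samples is correct. A minimal sketch (the function name is illustrative):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn from n generations of which c are correct, passes the tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this reduces to the fraction of correct samples:
print(pass_at_k(10, 5, 1))  # 0.5
```

The per-problem scores are then averaged over the benchmark to give the percentages in the table.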


Usage

Important: You must use apply_chat_template when formatting inputs. Passing raw text directly to the tokenizer will produce incorrect results.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "your-org/microcoder-1.5b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": "Write a Python function that returns the nth Fibonacci number."
    }
]

# Format the conversation with the model's chat template before tokenizing.
input_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
generated = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```

Training Details

Microcoder 1.5B was fine-tuned using LoRA on top of Qwen 2.5 Coder 1.5B Instruct. The training focused on code-heavy datasets covering multiple programming languages and problem-solving scenarios, aiming to improve instruction-following and code correctness at a small model scale.
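The appeal of LoRA at this scale is that the base weights stay frozen and only a low-rank update W' = W + (alpha/r) * B @ A is trained, where B is d-by-r and A is r-by-k. A minimal illustration of the resulting parameter savings (the dimensions and rank below are illustrative, not the actual Microcoder training configuration):

```python
# LoRA replaces a full d-by-k weight update with two low-rank factors,
# B (d-by-r) and A (r-by-k). Only A and B are trained; W stays frozen.

def lora_param_counts(d: int, k: int, r: int) -> tuple[int, int]:
    """Return (full fine-tune params, LoRA params) for one weight matrix."""
    full = d * k
    lora = d * r + r * k  # parameters in B plus parameters in A
    return full, lora

# Hypothetical 1024x1024 projection with rank r=8:
full, lora = lora_param_counts(1024, 1024, 8)
print(full, lora)            # 1048576 16384
print(f"{lora / full:.2%}")  # trains ~1.56% of the matrix's parameters
```

This is why a 1.5B-parameter model can be fine-tuned with a small fraction of the memory and compute of a full fine-tune.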


Credits

See MODEL_CREDITS.md and DATASET_CREDITS.md in this repository for attribution of the base model and the datasets used for fine-tuning.

License

The Microcoder 1.5B model weights and associated code in this repository are released under the BSD 3-Clause License. See LICENSE for details.

Note that the base model (Qwen 2.5 Coder 1.5B Instruct) and the datasets used for fine-tuning are subject to their own respective licenses, as detailed in the credit files above.


Notice

The documentation files in this repository (including README.md, MODEL_CREDITS.md, DATASET_CREDITS.md, and other .md files) were generated with the assistance of an AI language model.
