# Microcoder 1.5B
Microcoder 1.5B is a code-focused language model fine-tuned from Qwen 2.5 Coder 1.5B Instruct using LoRA (Low-Rank Adaptation) on curated code datasets. It is designed for code generation, completion, and instruction-following tasks in a lightweight, efficient package.
## Model Details
| Property | Value |
|---|---|
| Base Model | Qwen 2.5 Coder 1.5B Instruct |
| Fine-tuning | LoRA |
| Parameters | ~1.5B |
| License | BSD 3-Clause |
| Language | English (primary), multilingual code |
| Task | Code generation, completion, instruction following |
## Benchmarks
| Benchmark | Metric | Score |
|---|---|---|
| HumanEval | pass@1 | 59.15% |
| MBPP+ | pass@1 | 52.91% |
HumanEval and MBPP+ results were obtained using the model in GGUF format with Q5_K_M quantization. Results may vary slightly with other formats or quantization levels.
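Both benchmarks report pass@1: the fraction of problems for which a generated solution passes all unit tests on the first sample. More generally, pass@k is usually computed with the standard unbiased estimator, sketched below (the exact evaluation harness used for these scores is not specified here):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples passes, given n total generations of which c are correct."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this reduces to the fraction of correct samples, c/n.
print(pass_at_k(10, 4, 1))  # 0.4
```

With one generation per problem (n = 1, k = 1), pass@1 is simply the share of problems solved on the first attempt, which is how single-sample scores like those above are typically obtained.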
## Usage

**Important:** You must use `apply_chat_template` when formatting inputs. Passing raw text directly to the tokenizer will produce incorrect results.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "pedrodev2026/microcoder-1.5b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": "Write a Python function that returns the nth Fibonacci number."
    }
]

# Format the conversation with the model's chat template and
# append the assistant generation prompt.
input_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
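Note that `generate` returns the prompt tokens followed by the completion, so the `decode` call above prints the formatted prompt as well. To decode only the model's reply, slice off the prompt length first; a minimal sketch using synthetic token ids in place of real tokenizer/model output:

```python
import torch

# Synthetic stand-ins: 5 "prompt" token ids, then 3 "generated" ids.
input_ids = torch.tensor([[101, 7, 8, 9, 102]])
outputs = torch.tensor([[101, 7, 8, 9, 102, 42, 43, 44]])

prompt_len = input_ids.shape[1]
new_tokens = outputs[0][prompt_len:]  # only the generated continuation
print(new_tokens.tolist())  # [42, 43, 44]
```

In the real example, `inputs["input_ids"].shape[1]` gives the prompt length, and the slice of `outputs[0]` can be passed to `tokenizer.decode` directly.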
## Training Details
Microcoder 1.5B was fine-tuned using LoRA on top of Qwen 2.5 Coder 1.5B Instruct. The training focused on code-heavy datasets covering multiple programming languages and problem-solving scenarios, aiming to improve instruction-following and code correctness at a small model scale.
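The exact LoRA hyperparameters are not published. For orientation only, a typical LoRA setup with the `peft` library looks like the following; every value here (rank, alpha, dropout, target modules) is an illustrative assumption, not the configuration actually used:

```python
from peft import LoraConfig

# Hypothetical configuration; the actual Microcoder 1.5B
# hyperparameters are not documented in this card.
lora_config = LoraConfig(
    r=16,                  # rank of the low-rank update matrices
    lora_alpha=32,         # scaling factor for the LoRA update
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

Such a config would be applied to the base model with `peft.get_peft_model` before training, leaving the base weights frozen and training only the low-rank adapters.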
## Credits

- Model credits — see `MODEL_CREDITS.md`
- Dataset credits — see `DATASET_CREDITS.md`
## License
The Microcoder 1.5B model weights and associated code in this repository are released under the BSD 3-Clause License. See LICENSE for details.
Note that the base model (Qwen 2.5 Coder 1.5B Instruct) and the datasets used for fine-tuning are subject to their own respective licenses, as detailed in the credit files above.
## Notice
The documentation files in this repository (including README.md, MODEL_CREDITS.md, DATASET_CREDITS.md, and other .md files) were generated with the assistance of an AI language model.