---
datasets:
- XenArcAI/MathX-5M
base_model:
- google/gemma-3-1b-it
pipeline_tag: text-generation
---
# Model Card: Parveshiiii/M1-MathX

## Model Details

- **Model Name:** Parveshiiii/M1-MathX
- **Base Architecture:** Gemma 3 (1B parameters)
- **Model Type:** Causal Language Model (text-generation)
- **Training Framework:** Hugging Face Transformers
- **Precision:** fp16 (see the loading sketch after this list)
- **Attention Mechanism:** Hybrid sliding-window and full attention layers
- **Tokenizer:** Gemma tokenizer (vocab size 262,144)
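
The snippet below is a minimal loading sketch that reflects the details above (fp16 weights, Gemma chat format). It uses the standard `transformers` Auto classes; the prompt, `device_map`, and generation settings are illustrative rather than prescribed by this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Parveshiiii/M1-MathX"

# Load the Gemma tokenizer and the fp16 weights noted above.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the card's fp16 precision
    device_map="auto",          # illustrative; place on CPU/GPU as available
)

# Gemma is a chat model, so format the prompt with the chat template.
messages = [{"role": "user", "content": "Solve 3x + 7 = 22 step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```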

## Usage

```python
from transformers import pipeline, TextStreamer

# High-level pipeline interface; the chat-style message list is passed directly.
pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")

messages = [
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"},
]

# Stream tokens to stdout as they are generated.
streamer = TextStreamer(pipe.tokenizer, skip_prompt=True)

# Cap the response length; raise the limit for longer derivations.
pipe(messages, streamer=streamer, max_new_tokens=512)
```
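
The streamer prints tokens as they are generated, which is convenient for long step-by-step answers; the `max_new_tokens` value above is only an illustrative cap.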

## Intended Use

- Designed for mathematical reasoning tasks, including problem solving, equation manipulation, and step-by-step derivations.
- Suitable for educational contexts, math tutoring, and research experiments in reasoning alignment.
- Not intended for general-purpose conversation or sensitive domains outside mathematics.

## Training Data

- **Dataset:** MathX (curated mathematical reasoning dataset)
- **Samples Used:** ~300
- **Training Steps:** 50
- **Method:** GRPO (Group Relative Policy Optimization) fine-tuning; a minimal sketch of the setup follows this list.
- **Objective:** Reinforcement-style alignment for improved reasoning clarity and correctness.
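
For readers who want to reproduce a comparable setup, here is a minimal GRPO fine-tuning sketch using TRL's `GRPOTrainer`. The reward function, hyperparameters, and dataset column names are illustrative assumptions, not the exact recipe used for this checkpoint.

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Illustrative: take a small slice of MathX prompts (column name assumed).
dataset = load_dataset("XenArcAI/MathX-5M", split="train[:300]")
dataset = dataset.map(lambda x: {"prompt": x["problem"]})

# Toy reward: favour completions that state a boxed final answer.
def boxed_answer_reward(completions, **kwargs):
    return [1.0 if "\\boxed{" in c else 0.0 for c in completions]

config = GRPOConfig(
    output_dir="m1-mathx-grpo",
    max_steps=50,                   # matches the step count reported above
    per_device_train_batch_size=4,
    num_generations=4,              # group size for the relative advantage
    max_completion_length=512,
    fp16=True,
)

trainer = GRPOTrainer(
    model="google/gemma-3-1b-it",
    args=config,
    train_dataset=dataset,
    reward_funcs=boxed_answer_reward,
)
trainer.train()
```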

## Performance

- Shows solid performance on small-scale math problems and symbolic reasoning tasks in informal testing.
- Early spot checks suggest improved accuracy over the base Gemma 3 1B model on math-specific prompts.
- Formal evaluation on GSM8K, MATH, and other benchmarks is still needed for a quantitative comparison; a lightweight evaluation sketch follows this list.
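
As a starting point for such an evaluation, the following sketch spot-checks exact-match accuracy on a small slice of GSM8K test problems. The prompt format and answer extraction are simplistic assumptions; a proper harness (e.g. lm-evaluation-harness) should be preferred for reported numbers.

```python
import re
from datasets import load_dataset
from transformers import pipeline

pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")
gsm8k = load_dataset("gsm8k", "main", split="test[:20]")  # tiny slice for a spot check

def final_number(text):
    """Return the last number appearing in a string, or None."""
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

correct = 0
for example in gsm8k:
    messages = [{"role": "user", "content": example["question"]}]
    output = pipe(messages, max_new_tokens=512)[0]["generated_text"]
    # With chat-style input the pipeline returns the message list; take the reply.
    reply = output[-1]["content"] if isinstance(output, list) else output
    # GSM8K gold answers end with "#### <number>".
    gold = final_number(example["answer"].split("####")[-1].strip())
    pred = final_number(reply)
    if pred is not None and gold is not None and float(pred) == float(gold):
        correct += 1

print(f"exact-match on slice: {correct}/{len(gsm8k)}")
```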

## Limitations

- Small dataset and limited training steps mean coverage is narrow.
- May overfit to MathX patterns and fail on broader or more complex problems.
- Not guaranteed to generalize outside mathematical reasoning.
- As a 1B model, capacity is limited compared to larger LLMs.

## Ethical Considerations

- Intended for safe educational use.
- Should not be deployed in high-stakes environments without further validation.
- Outputs may contain errors; human oversight is required.

## Citation

If you use this model, please cite as:

```bibtex
@misc{Parvesh2025M1MathX,
  author       = {Parvesh Rawal},
  title        = {Parveshiiii/M1-MathX: A Gemma-1B model fine-tuned on MathX with GRPO},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/Parveshiiii/M1-MathX}}
}
```

---