---
language: 
  - en
license: apache-2.0
tags:
  - text-generation
  - instruction-tuning
  - multi-task
  - reasoning
  - email
  - summarization
  - chat
  - peft
  - lora
  - qwen
  - deepseek
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
datasets:
  - HuggingFaceTB/smoltalk
  - snoop2head/enron_aeslc_emails
  - lucadiliello/STORIES
  - abisee/cnn_dailymail
  - wiki40b
model_type: causal-lm
inference: true
library_name: peft
pipeline_tag: text-generation
---

# 🧠 Deepseek-R1-multitask-lora

**Author:** Gilbert Akham  
**License:** Apache-2.0  
**Base model:** [`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)  
**Adapter type:** LoRA (PEFT)  
**Capabilities:** Multi-task generalization & reasoning  

---

## 🚀 What It Can Do

This multitask fine-tuned model handles a broad set of natural language and reasoning-based tasks, such as:

- ✉️ **Email & message writing** – generate clear, friendly, or professional communications.
- 📖 **Story & creative writing** – craft imaginative narratives, poems, and dialogues.
- 💬 **Conversational chat** – maintain coherent, context-aware conversations.
- 💡 **Explanations & tutoring** – explain technical or abstract topics simply.
- 🧩 **Reasoning & logic tasks** – provide step-by-step answers for analytical questions.
- 💻 **Code generation & explanation** – write and explain Python or general programming code.
- 🌍 **Translation & summarization** – translate between multiple languages or condense information.

The model’s multi-domain training (based on datasets like SmolTalk, Everyday Conversations, and reasoning-rich samples) makes it suitable for assistants, chatbots, content generators, or educational tools.
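
The adapter can be loaded on top of the base model with 🤗 Transformers, bitsandbytes, and PEFT. A minimal loading sketch follows; the adapter repo path used in it is a placeholder for this repository's actual Hub id.

```python
# Minimal loading sketch. Assumptions: the adapter repo id below is a
# placeholder for this repository's Hub path, and the 4-bit settings mirror
# the training precision listed in the table further down.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
adapter_id = "GilbertAkham/deepseek-R1-multitask-lora"  # placeholder repo path

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit quantized weights
    bnb_4bit_compute_dtype=torch.float16,  # FP16 compute
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA adapter
model.eval()

# For deployment without a PEFT dependency, the adapter can also be merged
# into the base weights: model = model.merge_and_unload()
```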

---

## 🧩 Training Details

| Parameter | Value |
|------------|-------|
| Base model | `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` |
| Adapter | LoRA (r=8, alpha=32, dropout=0.1) |
| Max sequence length | 1024 |
| Learning rate | 3e-5 (cosine decay) |
| Optimizer | `adamw_8bit` |
| Grad Accumulation | 4 |
| Precision | 4-bit quantized, FP16 compute |
| Steps | 12k total (best @ ~8.2k) |
| Training time | ~2.5h on A4000 |
| Frameworks | 🤗 Transformers, PEFT, TRL, BitsAndBytes |

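For reference, a sketch of how these hyperparameters translate into PEFT and 🤗 Transformers configuration objects; the `target_modules` list is an assumption, since the card only states the values in the table:

```python
# Sketch of how the hyperparameters above map onto PEFT / Transformers
# configuration objects. The target_modules list is an assumption (typical
# projection layers for Qwen-style blocks); the card only states the values
# shown in the table.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
)

training_args = TrainingArguments(
    output_dir="deepseek-r1-multitask-lora",
    learning_rate=3e-5,
    lr_scheduler_type="cosine",
    optim="adamw_8bit",              # 8-bit AdamW via bitsandbytes
    gradient_accumulation_steps=4,
    max_steps=12_000,
    fp16=True,                       # FP16 compute on top of 4-bit weights
)
# The 1024-token maximum sequence length is applied at the trainer level
# (e.g. TRL's SFT trainer), together with lora_config and training_args.
```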
---

## 🧠 Reasoning Capability

Thanks to the integration of **SmolTalk** and diverse multi-task prompts, the model learns:
- **Chain-of-thought style reasoning**
- **Conversational grounding**
- **Multi-step logical inferences**
- **Instruction following** across domains

Example:
```text
### Task: Explain reasoning

### Input:
If a train leaves City A at 3 PM and arrives at City B at 6 PM, covering 180 km, what is its average speed?

### Output:
The train travels 180 km in 3 hours.
Average speed = 180 ÷ 3 = 60 km/h.
```
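The scaffold above can be built programmatically; a short sketch, reusing the `model` and `tokenizer` objects from the loading example earlier in this card (the `build_prompt` helper is illustrative, not part of the released code):

```python
# Illustrative helper mirroring the "### Task / ### Input / ### Output"
# scaffold shown above; `model` and `tokenizer` are the objects loaded in
# the earlier sketch, and `build_prompt` is not part of the released code.
def build_prompt(task: str, input_text: str) -> str:
    return f"### Task: {task}\n\n### Input:\n{input_text}\n\n### Output:\n"

prompt = build_prompt(
    "Explain reasoning",
    "If a train leaves City A at 3 PM and arrives at City B at 6 PM, "
    "covering 180 km, what is its average speed?",
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```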