Update README.md

README.md CHANGED

@@ -19,14 +19,14 @@ library_name: transformers
 
 **DeepBrainz-R1-4B-40K** is a compact, high-performance reasoning model engineered by **DeepBrainz AI & Labs**. It is part of the **DeepBrainz-R1 Series**, designed to deliver frontier-class reasoning capabilities at cost-effective parameter sizes.
 
-This specific variant offers a **40,960 token context window**, making it suitable for
+This specific variant offers a **40,960 token context window**, making it suitable for extended-context evaluation and repository-level code reasoning.
 
 ---
 
 ## 🚀 Model Highlights
 
 - **Parameter Count:** ~4B
-- **Context Window:** 40,960 tokens
+- **Context Window:** up to 40,960 tokens (extended context; experimental)
 - **Context Type:** Extended (RoPE)
 - **Specialization:** STEM Reasoning, Logic, Code Analysis
 - **Architecture:** Optimized Dense Transformer
@@ -41,7 +41,8 @@ This specific variant offers a **40,960 token context window**, making it suitab
 - **Code Generation:** Writing and debugging algorithms.
 - **Structured Data Extraction:** Parsing and reasoning over unstructured text.
 
-> **Note:** This is a
+> **Note:** This is a post-trained reasoning variant intended for evaluation and experimentation.
+> It is not production-validated and is not optimized for open-ended conversational chat.
 
 ---
 
@@ -70,9 +71,9 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 ## 🏗️ Technical Summary
 
-
+This model has undergone **post-training** to improve structured reasoning behavior, mathematical problem solving, and robustness in agentic workflows.
 
-*
+*Detailed post-training recipes and dataset compositions are not fully disclosed.*
 
 ---
 
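The updated highlights describe the 40,960-token window as "Extended (RoPE)". The card does not disclose the extension method, but one common technique is linear position interpolation: positions are divided by a scale factor so that a longer sequence maps back into the rotary-angle range the model was trained on. The sketch below is purely illustrative; the helper, the 32,768-token native window, and the 1.25× scale are assumptions, not details from the model card.

```python
import numpy as np

def rope_angles(positions, dim=64, base=10000.0, scale=1.0):
    """Angles fed to rotary position embeddings.

    scale > 1.0 compresses positions (linear position interpolation),
    one common way to stretch a trained context window. Illustrative
    only -- not the method disclosed for this model.
    """
    # One inverse frequency per rotated channel pair.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(np.asarray(positions, dtype=np.float64) / scale, inv_freq)

# Hypothetical numbers: stretching a 32,768-token native window to 40,960.
scale = 40960 / 32768  # 1.25
native = rope_angles(np.arange(4), dim=8)
extended = rope_angles(np.arange(4) * scale, dim=8, scale=scale)

# Interpolated positions land back inside the trained angle range.
assert np.allclose(native, extended)
```

The design trade-off is that compressing positions reduces the angular resolution between adjacent tokens, which is why extended-context variants are often flagged, as this card does, as experimental until evaluated at the longer lengths.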