Delta-Vector/Nanuq-R1-9B


Delta-Vector committed on Sep 28, 2025

Commit 59a2a8c · verified · 1 parent: b286f7d

Update README.md

Files changed (1)

README.md +3 -1


@@ -272,7 +272,9 @@ a:hover .link-arrow {
             </div>
           </div>
           <div class="model-description">
-            <p>A sequel! The new Nanuq series is meant to be a testing ground for my GRPO experiments. This model is built on top of Austral Xgen 9B: I made an RL env using PrimeIntellect-ai/verifiers and implemented InternLM/POLAR in said env, then finetuned the model for 150 steps on Pocketdoc's Systemmax dataset, and this was the result.</p>
+            <p>A sequel! The new Nanuq series is meant to be a testing ground for my GRPO experiments. This model is meant to have great instruction following and system prompt adherence in creative scenarios.</p>
+            <p>This model is built on top of Austral Xgen 9B: I made an RL env using PrimeIntellect-ai/verifiers and implemented InternLM/POLAR in said env, then finetuned the model for 150 steps on Pocketdoc's Systemmax dataset, and this was the result.</p>
+            <p>There are a lot of things I could do differently, as the reward almost falls flat as soon as you get out of warm-up, but this model was pretty decent so I decided to release it. Hope people enjoy it!</p>
           </div>
         </div>
       </div>
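
Note on the GRPO step mentioned in the description: the sketch below shows only the group-relative advantage computation that GRPO is named for, assuming each sampled completion for a prompt has already received a scalar reward (for example from a reward model such as POLAR). The function name, the `eps` parameter, and the sample scores are hypothetical and are not taken from this repository or from the verifiers library.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the other samples for the same prompt.

    Hypothetical helper: subtract the group mean and divide by the group
    standard deviation (plus eps to avoid division by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up reward scores for four completions of one prompt.
spread = group_relative_advantages([0.2, 0.4, 0.6, 0.8])

# A group where every completion scores the same.
flat = group_relative_advantages([0.5, 0.5, 0.5, 0.5])
```

When every completion in a group receives the same reward, as in `flat`, all advantages are zero and that group contributes no learning signal.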