If you like it, give the demo a little star and send a shoutout to @MaxLSB, @jddqd, and @GAD-cell for absolutely obliterating the Pareto frontier of French language understanding.
My team recently won 1st place in the BEHAVIOR Challenge at NeurIPS. The competition focused on training a single policy to complete 50 long-horizon household tasks in simulation.
We built an end-to-end policy based on Pi0.5 with a bunch of custom modifications. Everything is open-sourced, and it should be useful for anyone exploring VLAs or adapting them to specific tasks.
Key Architecture Changes:
- Replaced the language model with 50 trainable task embeddings (no text at all)
- Correlated noise for Flow Matching: ϵ ∼ N(0, 0.5I + 0.5Σ), using the dataset action covariance
- Learnable mixed-layer attention: each action expert layer attends to a trainable mix of all VLM layers
- System 2 stage tracking: the model predicts the task stage, we smooth it with voting and feed it back as context
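The correlated-noise idea above can be sketched as follows. This is my reconstruction from the formula ϵ ∼ N(0, 0.5I + 0.5Σ), not the team's released code; the function name and shapes are assumptions.

```python
import numpy as np

# Sketch (not the team's exact code): sample flow-matching noise from
# N(0, 0.5*I + 0.5*Sigma), where Sigma is the empirical covariance of
# flattened action chunks from the dataset.
def sample_correlated_noise(actions, num_samples, seed=None):
    """actions: (N, D) array of flattened dataset action chunks."""
    rng = np.random.default_rng(seed)
    d = actions.shape[1]
    sigma = np.cov(actions, rowvar=False)            # dataset action covariance
    cov = 0.5 * np.eye(d) + 0.5 * sigma              # blend with identity
    chol = np.linalg.cholesky(cov + 1e-8 * np.eye(d))
    z = rng.standard_normal((num_samples, d))
    return z @ chol.T                                # eps ~ N(0, cov)
```

Blending with the identity keeps the covariance well-conditioned even when some action dimensions are nearly constant in the data.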
Training:
- Multi-sample Flow Matching: 15 FM samples per VLM pass to reduce gradient variance
- Delta action space + per-timestamp normalization
- FAST auxiliary loss and stage prediction loss
- Trained on 224×224 RGB + proprioception only
- 4 fine-tuned checkpoints, all derived from a multi-task model trained on all 50 tasks
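The delta action space and per-timestamp normalization might look roughly like this sketch (my assumptions: actions arrive as (N, T, D) chunks, and each chunk timestamp gets its own statistics; names are illustrative):

```python
import numpy as np

# Illustrative sketch (not the released code): convert absolute action chunks
# to deltas from the current state, then normalize each chunk timestamp with
# its own dataset statistics instead of one global mean/std.
def to_delta(chunks, states):
    """chunks: (N, T, D) absolute actions; states: (N, D) current robot states."""
    return chunks - states[:, None, :]

def fit_per_timestamp_stats(delta_chunks):
    """Returns (mean, std), each of shape (T, D): one stat per chunk timestamp."""
    mean = delta_chunks.mean(axis=0)
    std = delta_chunks.std(axis=0) + 1e-8
    return mean, std

def normalize(delta_chunk, mean, std):
    return (delta_chunk - mean) / std
```

Per-timestamp statistics matter because later steps in a chunk typically drift further from the current state, so a single global scale would under-normalize them.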
Inference Optimizations:
- Soft inpainting: predict 30 actions, execute 26, use 4 as input for the next chunk
- Correlation-aware guidance of inpainting to keep action chunks smooth
- 1.3× speedup via cubic spline compression
- General correction rule: reopen the gripper after failed grasps
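The soft-inpainting loop can be sketched like this (a hypothetical reconstruction of the 30/26/4 scheme; `predict_chunk` stands in for the actual policy call):

```python
import numpy as np

# Hypothetical sketch of the soft-inpainting loop: predict a 30-step action
# chunk, execute the first 26 steps, and pass the remaining 4 back to the
# model as context so the next chunk starts where this one left off.
CHUNK, EXEC = 30, 26  # overlap = CHUNK - EXEC = 4 actions

def rollout(predict_chunk, num_chunks):
    """predict_chunk(tail) -> (CHUNK, D) actions; tail is (4, D) or None."""
    executed, tail = [], None
    for _ in range(num_chunks):
        chunk = predict_chunk(tail)      # model conditions on the overlap
        executed.append(chunk[:EXEC])    # execute 26 actions
        tail = chunk[EXEC:]              # keep the last 4 as next context
    return np.concatenate(executed)
```

Re-predicting the overlap instead of hard-clamping it is what makes the inpainting "soft": the model can reconcile the carried-over actions with new observations.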
EduHelp with more empathy, based on a model fine-tuned on psychotherapeutic preferences, just landed: Beck-8B as the base model, 13,000 steps on an educational dataset. Time to go further and build more 🥰 s3nh/EduHelp_Beck_8B. Thanks to @basilic_ai for the compute <3
Just tried to create an educational assistant for younger people who can struggle with visualisation of 'what is this sorcery all about'. It's the first of my spare-time projects: SFT on Qwen3-8B.
EduHelper is a child-friendly tutoring assistant fine-tuned from the Qwen3-8B base model using parameter-efficient fine-tuning (PEFT) with LoRA on the ajibawa-2023/Education-Young-Children dataset.
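A PEFT/LoRA setup of this kind might look roughly like the config below; the rank, alpha, dropout, and target modules are my assumptions for illustration, not the author's published recipe.

```python
from peft import LoraConfig

# Illustrative LoRA config for a Qwen3-8B causal-LM fine-tune; these
# hyperparameters are assumptions, not the released EduHelper settings.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

Wrapping the base model with `peft.get_peft_model(model, lora)` then trains only the low-rank adapters, which is what keeps an 8B fine-tune feasible on modest hardware.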
The purpose here is to get an idea of the profile of the models with the greatest impact in open source (we are not interested in closed models here!).
> Quality ≈ 3–4B dense, yet faster than Qwen3-1.7B
> MoE designed to run on phones/laptops (llama.cpp / vLLM)
> Pre-trained on 12T tokens → strong math/code/IF