AI & ML interests

Local LLMs

Recent Activity

prithivMLmods posted an update about 10 hours ago
Flux-Klein-KV-Edit-Consistency demo is now available on Spaces. It preserves character identity and delivers high-quality, realistic results after edits. No special prompts are needed: just upload an image, type your prompt, and get the edited result blazingly fast.

🔥 Demo Space: prithivMLmods/flux-klein-kv-edit-consistency
🤗 Model: black-forest-labs/FLUX.2-klein-9b-kv
🤗 Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
🔗 Gradio Server Mode: https://www.gradio.app/main/guides/server-mode

➔ Built with Headless Gradio, an alternative to using gr.Blocks for creating the frontend and triggering events, powered by FastAPI + Gradio. You can now design the frontend however you want, with continued support for APIs, MCP, and ZeroGPU.

➔ Gradio Server Mode is now available from gradio@v6.10.0.

To learn more, visit the app page or the respective model pages.
Parveshiiii posted an update 2 days ago
Just did something I've been meaning to try for ages.

In only 3 hours, on 10 billion+ tokens, I trained a custom BPE + tiktoken-style tokenizer using my new library microtok, and it hits the same token efficiency as Qwen3.

Tokenizers have always felt like black magic to me. We drop them into every LLM project, but actually training one from scratch? That always seemed way too complicated.

Turns out it doesnโ€™t have to be.

microtok makes the whole process stupidly simple: literally just 3 lines of code. No heavy setup, no GPU required. I built it on top of the Hugging Face tokenizers library so it stays clean, fast, and actually understandable.
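The three microtok lines themselves aren't reproduced in this post, so as a rough, from-scratch illustration of what any BPE trainer does under the hood (toy code in plain Python, not microtok's actual API):

```python
from collections import Counter

def train_bpe(corpus, num_merges):
    """Learn BPE merge rules from a list of words (toy sketch)."""
    # Represent each word as a tuple of single characters.
    vocab = Counter()
    for word in corpus:
        vocab[tuple(word)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole corpus.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge everywhere.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = train_bpe(["low", "lower", "lowest", "low"], num_merges=2)
# merges: [('l', 'o'), ('lo', 'w')]
```

Real trainers (like the Hugging Face tokenizers backend microtok builds on) do exactly this, just with byte-level pre-tokenization and far faster pair counting.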

If you've ever wanted to look under the hood and build your own optimized vocabulary instead of just copying someone else's, this is the entry point you've been waiting for.

I wrote up the full story, threw in a ready-to-run Colab template, and dropped the trained tokenizer on Hugging Face.

Blog → https://parveshiiii.github.io/blogs/microtok/
Trained tokenizer → Parveshiiii/microtok
GitHub repo → https://github.com/Parveshiiii/microtok
Severian posted an update 2 days ago
Iโ€™ve been working on a new mathematical approach to real-time video compositing and background removal, and I wanted to share a live demo.

Traditionally, real-time keyers either use 3D color-space bounding boxes (which struggle with semi-transparent hair and motion blur) or heavy Machine Learning models (which require massive GPU compute and often suffer from temporal "jitter" on the edges).

I wanted to see if I could solve this using purely deterministic math so it could run client-side in a standard browser.

The engine uses a custom mathematical framework I call CMT SRL SEFA. Instead of looking at raw color values or guessing semantics like an AI model, it treats the video feed as complex-encoded sequences. It uses harmonic frequencies to map phase geometry and applies a "Stability Cost Function", searching for its global minimum. In short: it isolates the foreground from the background by measuring signal complexity and structural contradictions.
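The CMT SRL SEFA math itself isn't spelled out in the post, so purely as a generic illustration of deterministic, cost-function-based keying (the cost terms, weights, and threshold below are my own invention, not the author's framework):

```python
def alpha_matte(frame, key_rgb, lam=0.5, thresh=60.0):
    """Toy deterministic keyer: per-pixel cost = chroma distance to the
    key color minus a penalty for local structure (edge complexity).
    Pixels whose cost stays below the threshold count as background.
    A generic sketch of cost-function keying, NOT CMT SRL SEFA."""
    h, w = len(frame), len(frame[0])

    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    matte = []
    for y in range(h):
        row = []
        for x in range(w):
            # Chroma term: how close the pixel is to the key color.
            chroma = dist(frame[y][x], key_rgb)
            # Complexity term: average difference to the 4-neighbors,
            # so busy edges (hair, motion blur) resist being keyed out.
            nbrs = [frame[ny][nx] for ny, nx in
                    ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w]
            complexity = sum(dist(frame[y][x], n) for n in nbrs) / len(nbrs)
            cost = chroma - lam * complexity
            row.append(0.0 if cost < thresh else 1.0)  # 0 = background
        matte.append(row)
    return matte
```

A real-time version would vectorize this per frame on the GPU; the point is only that the matte comes from minimizing a deterministic cost, with no learned model in the loop.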

Give it a try on your own messy plates. As I am not a VFX artist, I am curious to hear your thoughts on what could be improved.

https://severian-cmt-sefa-realtime-vfx-keyer.hf.space/
MaziyarPanahi posted an update 3 days ago
We annotated 119K medical images with two frontier VLMs (Qwen 3.5, Kimi K2.5), cross-validated at 93% agreement, and produced 110K training records, all for under $500. Fine-tuning 3 small models (2-3B params) improved all benchmarks: best model reaches +15.0% average exact match.
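The actual pipeline lives in the linked blog post; as a toy sketch of the cross-validation step described above (image keys and label values are hypothetical), keeping only records where the two annotator models agree might look like:

```python
def cross_validate(labels_a, labels_b):
    """Keep records where two annotators produce the same label and
    report the agreement rate (toy sketch of agreement filtering)."""
    kept = {img: lab for img, lab in labels_a.items()
            if labels_b.get(img) == lab}
    agreement = len(kept) / len(labels_a) if labels_a else 0.0
    return kept, agreement

# Toy labels from two hypothetical annotator models.
a = {"img1": "pneumonia", "img2": "normal", "img3": "fracture"}
b = {"img1": "pneumonia", "img2": "normal", "img3": "normal"}
kept, rate = cross_validate(a, b)  # keeps img1 and img2
```

Filtering 119K raw annotations down to ~110K agreed records at 93% agreement is the same idea at scale: disagreements are simply dropped rather than adjudicated.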

Everything is open-sourced: datasets, adapters, and code.

https://huggingface.co/blog/OpenMed/synthvision
prithivMLmods posted an update 7 days ago
Map-Anything v1 (Universal Feed-Forward Metric 3D Reconstruction) demo is now available on Hugging Face Spaces. Built with Gradio and integrated with Rerun, it performs multi-image and video-based 3D reconstruction, depth and normal-map estimation, and interactive measurements.

🤗 Demo: prithivMLmods/Map-Anything-v1
🤗 Model: facebook/map-anything-v1
🤗 HF Papers: MapAnything: Universal Feed-Forward Metric 3D Reconstruction (2509.13414)
prithivMLmods posted an update 10 days ago
Introducing QIE-Bbox-Studio! 🔥🤗

The QIE-Bbox-Studio demo is now live, more precise and packed with more options. Users can manipulate images with object removal, design addition, and even move objects from one place to another, all with fast 4-step inference.

🤗 Demo: prithivMLmods/QIE-Bbox-Studio
🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/QIE-Bbox-Studio

🚀 Models [LoRA]:

● QIE-2511-Object-Mover-Bbox: prithivMLmods/QIE-2511-Object-Mover-Bbox
● QIE-2511-Object-Remover-Bbox-v3: prithivMLmods/QIE-2511-Object-Remover-Bbox-v3
● QIE-2511-Outfit-Design-Layout: prithivMLmods/QIE-2511-Outfit-Design-Layout
● QIE-2509-Object-Remover-Bbox-v3: prithivMLmods/QIE-2509-Object-Remover-Bbox-v3
● QIE-2509-Object-Mover-Bbox: prithivMLmods/QIE-2509-Object-Mover-Bbox

🚀 Collection:

● Qwen Image Edit [Layout Bbox]: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-layout-bbox

To learn more, visit the app page or the respective model pages.
Nymbo posted an update 12 days ago
We should really have a release date range slider on the /models page. Tired of "trending/most downloaded" being the best way to sort and still seeing models from 2023 on the first page just because they're embedded in enterprise pipelines and get downloaded repeatedly. "Recently Created/Recently Updated" don't solve the discovery problem considering the amount of noise to sift through.

Slight caveat: Trending actually does have some recency bias, but it's not strong/precise enough.
OzTianlu posted an update 13 days ago
Arcade-3B: SmolReasoner
NoesisLab/Arcade-3B
Arcade-3B is a 3B instruction-following and reasoning model built on SmolLM3-3B. It is the public release from the ARCADE project at NoesisLab, which investigates the Stateโ€“Constraint Orthogonality Hypothesis: standard Transformer hidden states conflate factual content and reasoning structure in the same subspace, and explicitly decoupling them improves generalization.
prithivMLmods posted an update 13 days ago
QIE-2509-Object-Remover-Bbox-v3 is a more stable version of the Qwen Image Edit visual grounding-based object removal model. The app was previously featured in HF Spaces of the Week and is now updated with the latest Bbox-v3 LoRA adapter.

🤗 Demo: prithivMLmods/QIE-Object-Remover-Bbox
🤗 LoRA: prithivMLmods/QIE-2509-Object-Remover-Bbox-v3
🤗 Collection: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-layout-bbox

To learn more, visit the app page or the respective model pages.
prithivMLmods posted an update 21 days ago
The Qwen3.5 Multimodal Understanding Demo, powered by Qwen3.5-2B, is now available on HF Spaces! It is a lightweight model designed for fast image and video reasoning. Built with Gradio, the demo showcases Image QA, Video QA, object detection, and 2D point tracking, along with real-time token streaming.

🤗 Demo: prithivMLmods/Qwen-3.5-HF-Demo
✅ Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
🔗 Qwen3.5-2B: Qwen/Qwen3.5-2B

To learn more, visit the app page or the respective model pages.
Ujjwal-Tyagi posted an update 21 days ago
LTX 2.3 is now available, with better visual quality and richer sound. Check it out! Lightricks/LTX-2.3
OzTianlu posted an update 22 days ago
We deleted the embedding layer: introducing Collins-Embedding-3M
NoesisLab/Collins-Embedding-3M
Most "small" models are just giant vocab tables in a trench coat. Collins-3M changes that. By using 2-Universal Hashing and Chernoff-bound noise suppression, we've collapsed the embedding space into a fixed O(1) hash map.
* STSB: 0.7114 (Beating many 100M+ models)
* Size: 3M (Edge-ready, IoT-ready)
* Tech: Randomized Sign-Hashing + RoPE positional injection.
Built by NoesisLab
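Collins-3M's actual implementation isn't shown in this post, but the general idea of a hashed, fixed-size embedding can be sketched generically: token ids index a shared parameter table through a 2-universal hash family with a cheap sign hash, so the table never grows with the vocabulary (all parameters below are my own illustrative choices):

```python
import random

class HashEmbedding:
    """Toy O(1)-vocabulary embedding: token ids map into a fixed-size
    parameter table via 2-universal-style hashes with random signs.
    A generic sketch of hashed embeddings, not Collins-3M's code."""

    def __init__(self, table_size, dim, num_hashes=2, seed=0):
        rng = random.Random(seed)
        self.table_size = table_size
        self.dim = dim
        # Shared parameter table: rows are reused across the whole vocab.
        self.table = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                      for _ in range(table_size)]
        # h_i(x) = ((a*x + b) mod p) mod table_size, a 2-universal family.
        self.p = 2_147_483_647  # Mersenne prime
        self.hashes = [(rng.randrange(1, self.p), rng.randrange(self.p))
                       for _ in range(num_hashes)]

    def embed(self, token_id):
        vec = [0.0] * self.dim
        for a, b in self.hashes:
            h = (a * token_id + b) % self.p
            row = h % self.table_size
            # Cheap sign hash: flips signs so colliding tokens tend to
            # cancel rather than alias (the noise-suppression intuition).
            sign = 1.0 if (h // self.table_size) % 2 == 0 else -1.0
            for j in range(self.dim):
                vec[j] += sign * self.table[row][j]
        return vec
```

Memory is `table_size * dim` regardless of vocabulary size, which is what makes this kind of model edge- and IoT-friendly.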
MaziyarPanahi posted an update 24 days ago
DNA, mRNA, proteins, AI. I spent the last year going deep into computational biology as an ML engineer. This is Part I of what I found. 🧬

In 2024, AlphaFold won the Nobel Prize in Chemistry.

By 2026, the open-source community had built alternatives that outperform it.

That's the story I find most interesting about protein AI right now. Not just the science (which is incredible), but the speed at which open-source caught up. Multiple teams, independently, reproduced and then exceeded AlphaFold 3's accuracy with permissive licenses. The field went from prediction to generation: we're not just modeling known proteins anymore, we're designing new ones.

I spent months mapping this landscape for ML engineers. What the architectures actually are (spoiler: transformers and diffusion models), which tools to use for what, and which ones you can actually ship commercially.

New post on the Hugging Face blog: https://huggingface.co/blog/MaziyarPanahi/protein-ai-landscape

Hope you all enjoy! 🤗
prithivMLmods posted an update 25 days ago
QIE-Object-Remover-Bbox Demo removes objects and artifacts from selected regions using bounding-box grounding. Built on Qwen-Image-Edit-2509 with Rapid Diffusers acceleration, it delivers fast 4-step inference via the QIE-2509 adapter. 🤗🔥

🔗 Demo Space: prithivMLmods/QIE-Object-Remover-Bbox
🔗 Qwen-Image-Edit-Rapid-AIO: prithivMLmods/Qwen-Image-Edit-Rapid-AIO-V4
🔗 Adapter (LoRA): prithivMLmods/QIE-2509-Object-Remover-Bbox

🔗 Collection: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-layout-bbox

To learn more, visit the app page or the respective model pages.
OzTianlu posted an update 26 days ago
🔥 UPGRADE in Kai: 30B Scaling! 🔥
NoesisLab/Kai-30B-Instruct
We are incredibly excited to announce that the Kai-30B-Instruct model and its official Space are now LIVE! 🚀
If you've been following the journey from Kai-0.35B to Kai-3B, you know we're rethinking how models reason. Tired of verbose, slow Chain-of-Thought (CoT) outputs that flood your screen with self-talk? So are we.
Kai-30B-Instruct scales up our Adaptive Dual-Search Distillation (ADS) framework. By bridging classical A* heuristic search with continuous gradient descent, we use an information-theoretic log-barrier to prune high-entropy reasoning paths during training.
The result? Pure implicit reasoning. The model executes structured logic, arithmetic carries, and branch selections as a reflex in a single forward pass; no external scaffolding required.
At 3B, we observed a phase transition where the model achieved "logical crystallization". Now, at 30B, we are giving the ADS regularizer the massive representational capacity it needs to tackle higher-order symbolic abstractions and complex reasoning tasks.
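The ADS objective itself isn't published in this post, but the "information-theoretic log-barrier" ingredient can be illustrated generically: penalize a distribution's entropy with a barrier that stays cheap for confident (low-entropy) distributions and diverges as entropy approaches a chosen threshold (the threshold `tau` and weight `lam` below are illustrative, not ADS's actual hyperparameters):

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def log_barrier_penalty(probs, tau, lam=1.0):
    """Generic log-barrier on entropy: finite while H(p) < tau and
    diverging as H(p) approaches tau, so high-entropy (wandering)
    reasoning distributions are pushed away. An illustration of the
    log-barrier idea only, not ADS's actual training objective."""
    h = entropy(probs)
    if h >= tau:
        return float("inf")  # barrier: the constraint H(p) < tau is violated
    return -lam * math.log(tau - h)
```

A sharply peaked next-step distribution pays almost nothing, while a near-uniform one is rejected outright, which is the sense in which the barrier "prunes" high-entropy paths.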
🧪 Test Kai yourself in our new Space:
NoesisLab/Kai-30B-Instruct
📦 Model weights:
NoesisLab/Kai-30B-Instruct
Bring your hardest math, logic, and coding benchmarks. We invite the community to stress-test the limits of the penalty wall! 🧱💥
OzTianlu posted an update 29 days ago
Scaling UP in Kai! 🌊
NoesisLab/Kai-3B-Instruct

Introducing NoesisLab/Kai-3B-Instruct. What happens when you force a 3B model to reason entirely in its latent space?
Meet Kai-3B, our latest industrial-grade reasoning model fine-tuned using the Adaptive Dual Search (ADS) algorithm.
GSM8K (0-shot, direct answer): 39.27% 🤯 (Llama-2-7B is ~14.6%)
HumanEval (pass@1): 39.02% 💻 (overtakes Gemma-2-2B's 30%)
MMLU (5-shot): 53.62% 📚 (crushing the 50% barrier)
ARC-Challenge: 51.88% 🎯
PIQA: 77.53%
HellaSwag: 69.53%
Kai-3B proves that reasoning density doesn't strictly require parameter bloat or verbose generation. It acts as a perfect, cold-blooded agent action engine, ideal for JSON routing, SWE-bench patch generation, and anywhere you need absolute structured certainty without token waste.
OzTianlu posted an update about 1 month ago
๐Ÿ›ก๏ธ Meet Spartacus-1B: Shattering the Memory Wall with True O(1) Inference! ๐Ÿš€
NoesisLab/Spartacus-1B-Instruct
NoesisLab/ChatSpartacus
At NoesisLab, we've entirely ripped out Softmax Attention and replaced it with Causal Monoid State Compression.
Say hello to Spartacus-1B-Instruct (1.3B) ๐Ÿ—ก๏ธ.
Instead of maintaining a massive, ever-growing list of past tokens, Spartacus compresses its entire causal history into a fixed-size state matrix per head. The result?
โšก True O(1) Inference: Memory footprint and generation time per token remain absolutely constant, whether you are on token 10 or token 100,000.
๐Ÿง  Explicit Causality: We threw away RoPE and attention masks. The model learns when to forget using dynamic, content-aware vector decay.
๐Ÿ”ฅ Blazing Fast Training: Full hardware utilization via our custom Triton-accelerated JIT parallel prefix scan.
๐Ÿ“Š Zero-Shot Benchmarks that Hit Hard:
O(1) architectures usually sacrifice zero-shot accuracy. Not Spartacus. It is punching way above its weight class, beating established sub-quadratic models (like Mamba-1.4B and RWKV-6-1.6B):
๐Ÿ† ARC-Challenge: 0.3063 (vs Mamba 0.284)
๐Ÿ† ARC-Easy: 0.5518
๐Ÿ† PIQA: 0.6915
prithivMLmods posted an update about 1 month ago
FireRed-Image-Edit-1.0 (Rapid) Fast Experimental Demo is Out! 🚀🤗

Demo: prithivMLmods/FireRed-Image-Edit-1.0-Fast

-> Paired the EditPlusPipeline with the Diffusers-compatible transformer weights of Rapid AIO from Qwen-Image-Edit. (experimental)
-> This fusion delivers more accurate instruction following, higher image quality, and consistent visual coherence @ 4-step fast inference.
-> Better maintains text styles with high fidelity, along with high-quality old photo restoration, enhancement, and best-in-class virtual try-on.

Ujjwal-Tyagi posted an update about 1 month ago
Public reports allege that Anthropic gobbled up trillions of tokens of copyrighted material and public data to build their castle. 🏰📄 Now that they're sitting on top, they're begging for special laws to protect their profits while pulling the ladder up behind them. 🪜🚫

But the hypocrisy meter just broke! 📉 They are accusing Chinese labs like DeepSeek, Minimax, and Kimi of "huge distillation attacks." The reality is that you can't loot the entire internet's library, lock the door, and then sue everyone else for reading through the window. Stop trying to gatekeep tech you didn't own in the first place. Read the complete article: https://huggingface.co/blog/Ujjwal-Tyagi/the-dark-underbelly-of-anthropic