---
license: cc-by-nc-sa-4.0
tags:
- robotics
- vision-language-action-model
- vision-language-model
---
# Model Card for InternVLA-M1_spatial
InternVLA-M1 is an open-source, end-to-end vision–language–action (VLA) framework for building and researching generalist robot policies.
- 🌐 Homepage: [InternVLA-M1 Project Page](https://internrobotics.github.io/internvla-m1.github.io/)
- 💻 Codebase: [InternVLA-M1 GitHub Repo](https://github.com/InternRobotics/InternVLA-M1)

## Training Details
```
action_chunk: 8
batch_size: 128
training_steps: 30k
```
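
As a rough reading of these hyperparameters, the sketch below works out how many training samples and supervised action steps the run would cover. It assumes that `30k` means 30,000 optimizer steps, that `batch_size` is the global batch size, and that `action_chunk: 8` means each sample supervises a chunk of 8 consecutive actions. The names `TrainingConfig` and `parse_steps` are hypothetical and are not taken from the InternVLA-M1 codebase.

```python
from dataclasses import dataclass


def parse_steps(text: str) -> int:
    """Convert a step count such as '30k' into an integer (hypothetical helper)."""
    text = text.strip().lower()
    if text.endswith("k"):
        return int(float(text[:-1]) * 1_000)
    return int(text)


@dataclass(frozen=True)
class TrainingConfig:
    """Illustrative container for the values listed above (not the project's API)."""

    action_chunk: int    # assumed: actions predicted per sample
    batch_size: int      # assumed: global batch size
    training_steps: int  # assumed: optimizer steps

    @property
    def samples_seen(self) -> int:
        # One batch is consumed per training step.
        return self.batch_size * self.training_steps

    @property
    def action_steps_supervised(self) -> int:
        # Each sample contributes one chunk of `action_chunk` actions.
        return self.samples_seen * self.action_chunk


config = TrainingConfig(
    action_chunk=8,
    batch_size=128,
    training_steps=parse_steps("30k"),
)
print(config.samples_seen)             # 3840000
print(config.action_steps_supervised)  # 30720000
```

Under these assumptions the run covers about 3.84 million samples. If the batch size is instead per device, scale the result by the number of devices.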

## Citation
```
@misc{internvla2024,
  title = {InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy},
  author = {InternVLA-M1 Contributors},
  year = {2025},
  howpublished = {arXiv},
}
```