OpenVLA OFT
Open Vision-Language-Action models with fine-tuning for robotic manipulation
Overview
OpenVLA (Open Vision-Language-Action) is an open-source model for robotic manipulation. It takes camera images and a natural-language instruction as input and predicts robot actions as output. OFT is the fine-tuning recipe used to adapt the model efficiently to specific tasks.
Key Features
- Vision-language-action architecture
- Efficient fine-tuning with OFT (Optimized Fine-Tuning)
- Multi-task robot manipulation
- Generalizable to new objects and environments
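
To make the vision-language-action idea concrete, here is a minimal sketch of the input/output contract such a policy exposes: an image plus an instruction goes in, and a low-dimensional action vector comes out. Every name here (`Observation`, `VLAPolicy`, `ZeroPolicy`, `run_episode`, `ACTION_DIM`) is hypothetical and chosen for illustration; this is not the OpenVLA API, and the 7-dimensional action layout is an assumption.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence

# Assumption for illustration: 6 end-effector deltas + 1 gripper command.
ACTION_DIM = 7


@dataclass
class Observation:
    """One policy input: an RGB image plus a natural-language instruction."""
    image: Sequence[Sequence[Sequence[int]]]  # H x W x 3 pixel values
    instruction: str


class VLAPolicy(Protocol):
    """Hypothetical interface: any object with this method counts as a policy."""
    def predict_action(self, obs: Observation) -> List[float]: ...


class ZeroPolicy:
    """Placeholder standing in for a real model: always returns a no-op action."""
    def predict_action(self, obs: Observation) -> List[float]:
        if not obs.instruction:
            raise ValueError("instruction must be non-empty")
        return [0.0] * ACTION_DIM


def run_episode(policy: VLAPolicy, observations: Sequence[Observation]) -> List[List[float]]:
    """Query the policy once per observation and collect the predicted actions."""
    return [policy.predict_action(obs) for obs in observations]


if __name__ == "__main__":
    obs = Observation(image=[[[0, 0, 0]]], instruction="pick up the cup")
    print(run_episode(ZeroPolicy(), [obs]))
```

A fine-tuned model would take the place of `ZeroPolicy` behind the same interface, which is why the surrounding control loop does not need to change when the policy is adapted to a new task.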