OpenVLA OFT

Open Vision-Language-Action models with fine-tuning for robotic manipulation

Overview

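OpenVLA (Open Vision-Language-Action) is an open-source model for robotic manipulation. It takes camera images and a natural-language instruction as input and predicts the robot actions needed to carry out that instruction. OFT is a fine-tuning recipe for adapting the pretrained model efficiently to a specific robot and task.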

Key Features

Vision-language-action architecture
Efficient fine-tuning with OFT (Optimized Fine-Tuning)
Multi-task robot manipulation
Generalization to new objects and environments