Vision Language Action Deployment
GitHub · OpenVLA · MuJoCo · Robosuite · Computer Vision · Robotics · Foundation Models
TLDR
Integrated the OpenVLA model with MuJoCo and Robosuite. The robot interprets text and images and outputs actions directly in its action space.
Detailed
Tech Stack:
OpenVLA, MuJoCo, Robosuite, custom position control API
Goal:
Deploy a vision-language-action model for robot manipulation tasks in simulation.
What I did:
- Integrated OpenVLA with the MuJoCo and Robosuite simulators
- Built a custom position control API to decode OpenVLA outputs (see the sketch after this list)
- Adjusted end-effector coordinates, grip strength, and orientation from the decoded actions
- Handled tasks such as picking and placing objects
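A minimal sketch of the decoding step is shown below. It assumes OpenVLA's 7-D action convention (position deltas, orientation deltas, gripper) and Robosuite's OSC_POSE controller action space; the function name, scaling factors, and gripper rescaling are illustrative assumptions, not part of either library.

```python
import numpy as np

def decode_openvla_action(action: np.ndarray,
                          pos_scale: float = 1.0,
                          rot_scale: float = 1.0) -> np.ndarray:
    """Map a 7-D OpenVLA output to a Robosuite OSC_POSE command (sketch).

    Assumed layout:
      action[:3]  -> end-effector position delta (x, y, z)
      action[3:6] -> end-effector orientation delta (roll, pitch, yaw)
      action[6]   -> gripper command
    """
    delta_pos = pos_scale * action[:3]
    delta_rot = rot_scale * action[3:6]
    # OpenVLA's gripper channel is typically in [0, 1], while Robosuite's
    # gripper actuator expects [-1, 1]; rescale here, and flip the sign if
    # the open/close convention of the chosen checkpoint differs.
    gripper = 2.0 * action[6] - 1.0
    return np.concatenate([delta_pos, delta_rot, [gripper]])
```

The scaling factors are where end-effector step size and orientation sensitivity can be tuned per task.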
What was achieved:
The system outputs actions directly in the end-effector action space rather than through a low-level control API, and handles a range of manipulation scenarios driven by language and vision.
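For context, a hedged end-to-end rollout sketch follows, pairing the Hugging Face `openvla/openvla-7b` checkpoint with a Robosuite pick-and-place environment. The task name, prompt, camera, `unnorm_key`, the robosuite-1.4-style `load_controller_config` call, and the `decode_openvla_action` helper from the earlier sketch are illustrative assumptions, not the project's exact configuration.

```python
import numpy as np
import robosuite as suite
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

# Load the OpenVLA checkpoint and its processor from Hugging Face.
processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b", torch_dtype=torch.bfloat16, trust_remote_code=True
).to("cuda:0")

# Robosuite pick-and-place task with an operational-space pose controller,
# so the 7-D decoded action maps directly onto the end effector.
env = suite.make(
    "PickPlaceCan",
    robots="Panda",
    controller_configs=suite.load_controller_config(default_controller="OSC_POSE"),
    use_camera_obs=True,
    camera_names="agentview",
    has_renderer=False,
)

instruction = "pick up the can and place it in the bin"
prompt = f"In: What action should the robot take to {instruction}?\nOut:"

obs = env.reset()
for _ in range(200):
    # Robosuite camera frames come out vertically flipped; undo that for the model.
    frame = Image.fromarray(obs["agentview_image"][::-1].copy())
    inputs = processor(prompt, frame).to("cuda:0", dtype=torch.bfloat16)
    # predict_action de-tokenizes and un-normalizes into a 7-D continuous action;
    # the unnorm_key picks the dataset statistics and is an illustrative choice here.
    action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
    obs, reward, done, info = env.step(decode_openvla_action(np.asarray(action)))
    if done:
        break
env.close()
```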