Revanth Gundala

Compressing Robot Vision into 8 Objects

We replaced 256 visual patch tokens with 8 learned object slots and trained a robot VLA from scratch. Slot compression improved training efficiency by 11%.

Mar 5, 2026

Trying to Make a VLA Its Own Reward Model

We tried replacing SRPO's 1.1B-parameter V-JEPA with the VLA's own SigLIP encoder. Here's what we learned.

Feb 19, 2026