Knowledge Distillation

Knowledge distillation trains a smaller student model to match the outputs of a larger teacher model, typically the teacher's temperature-softened output probabilities ("soft targets") rather than hard labels alone. The student thereby learns a compressed version of the teacher's knowledge. In robotics, distillation is used to compress large vision-language-action (VLA) models for real-time inference, to transfer teachers trained with privileged information (e.g., ground-truth state in simulation) to vision-only students, and to create efficient, edge-deployable perception models.
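
As a concrete illustration, here is a minimal PyTorch sketch of the classic distillation objective (Hinton et al., 2015): the student's temperature-softened predictions are matched to the teacher's via KL divergence, blended with an ordinary cross-entropy term on the hard labels. The function name `distillation_loss` and the hyperparameter values are illustrative choices, not part of the original entry.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target distillation term with a hard-label term.

    temperature > 1 softens both distributions so the student can
    learn from the teacher's relative class probabilities; alpha
    weights the distillation term against the hard-label term.
    """
    # Soften both output distributions with the temperature.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)

    # KL divergence between soft distributions; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2

    # Standard supervised loss on the hard labels.
    ce_term = F.cross_entropy(student_logits, targets)

    return alpha * kd_term + (1 - alpha) * ce_term

# Toy usage: in practice the teacher is a frozen large model and the
# student a small one; here random logits stand in for both.
if __name__ == "__main__":
    batch, num_classes = 8, 10
    student_logits = torch.randn(batch, num_classes, requires_grad=True)
    teacher_logits = torch.randn(batch, num_classes)
    targets = torch.randint(0, num_classes, (batch,))
    loss = distillation_loss(student_logits, teacher_logits, targets)
    loss.backward()  # gradients flow to the student only
```

In privileged-information setups the same loss applies, except the teacher consumes state the student never sees (e.g., object poses in simulation) while the student receives only camera input.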

ML · Training
