Masked Autoencoder

A self-supervised pre-training approach in which random patches of an image are masked and a Vision Transformer (ViT) encoder-decoder learns to reconstruct the missing patches. Masked autoencoder (MAE) pre-training learns strong visual representations that transfer well to downstream tasks. In robotics, MAE-pre-trained vision encoders provide robust features for manipulation policies, especially when labeled robot data is limited.
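The sketch below illustrates the core MAE training step, assuming toy dimensions and stand-in networks: patches are randomly shuffled, only the visible ~25% pass through the encoder, mask tokens fill in the hidden positions before decoding, and the reconstruction loss is computed only on the masked patches. The patch count, mask ratio, and the tiny encoder/decoder are illustrative choices, not a reference implementation.

```python
import torch
import torch.nn as nn

def random_masking(x, mask_ratio=0.75):
    """Randomly drop a fraction of patches; return visible patches,
    a binary mask (1 = masked), and indices to restore patch order."""
    B, N, D = x.shape
    n_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N, device=x.device)        # per-patch random scores
    ids_shuffle = noise.argsort(dim=1)               # random permutation
    ids_restore = ids_shuffle.argsort(dim=1)         # inverse permutation
    ids_keep = ids_shuffle[:, :n_keep]
    x_visible = torch.gather(
        x, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    mask = torch.ones(B, N, device=x.device)
    mask[:, :n_keep] = 0
    mask = torch.gather(mask, 1, ids_restore)        # back to original order
    return x_visible, mask, ids_restore

# Toy setup: batch of 2 images, 196 patches (14x14 grid), 768-dim embeddings.
B, N, D = 2, 196, 768
patches = torch.randn(B, N, D)                       # stand-in for patch embeddings

x_visible, mask, ids_restore = random_masking(patches, mask_ratio=0.75)

# Tiny stand-ins for the ViT encoder (sees only visible patches) and the
# lightweight decoder that reconstructs the full patch sequence.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D, nhead=8, batch_first=True),
    num_layers=1)
decoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D, nhead=8, batch_first=True),
    num_layers=1)

latent = encoder(x_visible)                          # encode only ~25% of patches

# Fill masked positions with a shared mask token (normally a learned
# parameter; a zero placeholder here), then restore original patch order.
mask_tokens = torch.zeros(B, N - latent.shape[1], D)
x_full = torch.cat([latent, mask_tokens], dim=1)
x_full = torch.gather(
    x_full, 1, ids_restore.unsqueeze(-1).expand(-1, -1, D))

pred = decoder(x_full)                               # reconstruct all patches

# MSE reconstruction loss, averaged over masked patches only.
loss = (((pred - patches) ** 2).mean(dim=-1) * mask).sum() / mask.sum()
print(loss.item())
```

In practice the reconstruction target is normalized pixel values per patch, and because the encoder processes only the visible subset, pre-training is substantially cheaper than encoding the full image.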

ML · Representation Learning · Vision
