Batch Size
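The number of training examples processed together in one forward-backward pass. Larger batches give lower-variance gradient estimates but require more memory, and very large batches may converge to sharper minima, which are often associated with worse generalization. In robot learning, batch sizes are often limited by GPU memory, especially when training on high-resolution images. Gradient accumulation works around this limit by summing gradients over several smaller micro-batches before each optimizer step, giving a larger effective batch size on limited hardware.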
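The idea behind gradient accumulation can be illustrated with a minimal sketch in plain Python. It assumes a toy one-parameter linear model with a mean-squared-error loss; the function names (`grad_mse`, `accumulated_grad`) and the data are hypothetical and chosen only for illustration, not taken from any particular framework. The sketch checks that averaging gradients over equal-size micro-batches reproduces the gradient of the full batch.

```python
# Toy illustration of gradient accumulation for the model y = w * x
# with a mean-squared-error loss. All names and data are illustrative.

def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Full-batch gradient obtained by accumulating over micro-batches."""
    if len(xs) % micro_batch_size != 0:
        raise ValueError("batch size must be a multiple of micro_batch_size")
    num_micro = len(xs) // micro_batch_size
    total = 0.0
    for i in range(num_micro):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Scale each micro-batch gradient by 1/num_micro so the running sum
        # equals the gradient of the mean loss over the full batch.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / num_micro
    return total


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.0, 15.9]
w = 0.5

full = grad_mse(w, xs, ys)              # one pass over all 8 examples
accum = accumulated_grad(w, xs, ys, 2)  # four micro-batches of 2 examples

# The two gradients agree up to floating-point rounding.
print(abs(full - accum) < 1e-9)
```

In this sketch only one micro-batch needs to be held in memory at a time, which is the source of the memory saving. The exact equivalence relies on equal-size micro-batches and a loss that is a plain mean over examples; layers whose output depends on batch statistics, such as batch normalization, would behave differently under micro-batching.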