Glossary

PPO

Proximal Policy Optimization: a policy-gradient RL algorithm that keeps each policy update close to the previous policy by maximizing a clipped surrogate objective, which approximates a trust-region constraint rather than enforcing one exactly. PPO is a widely used default for robot locomotion (legged robots, humanoids) and sim-to-real transfer because it is stable, simple to implement, and reasonably sample-efficient for an on-policy method. It achieves stability comparable to TRPO using only first-order gradient steps, avoiding the computational cost of TRPO's constrained optimization.
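The clipped surrogate objective mentioned above can be illustrated with a minimal sketch in plain Python. The function name `ppo_clip_objective`, its argument names, and the default `clip_eps=0.2` are assumptions chosen for illustration; they do not come from any particular library. The sketch only evaluates the objective for a batch of samples and leaves out the value-function loss, entropy bonus, and advantage estimation that a full PPO implementation typically includes.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (a quantity to maximize).

    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum makes the objective a pessimistic bound, so the
        # policy gains nothing from moving the ratio outside the interval.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the new and old log-probabilities are equal, the ratio is 1 and the objective reduces to the mean advantage. Once the ratio leaves the clipping interval in the direction the advantage favors, the objective stops increasing, which is what discourages overly large policy updates.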


Robot Learning, RL
