Reward Function
The reward function defines the learning objective for a reinforcement learning agent: it assigns a scalar reward r(s, a, s') to each (state, action, next-state) transition, telling the agent how desirable its behavior is. Designing the reward function is one of the hardest parts of applying RL to robotics: sparse rewards (1 on success, 0 otherwise) are easy to specify correctly but make learning slow, since the agent rarely receives a useful signal; dense rewards (e.g., negative distance to the goal) guide learning at every step but can be exploited in unintended ways (reward hacking). Alternatives include learning rewards from demonstrations (inverse RL, RLHF), task-specific simulation metrics, and learned preference models. Imitation learning sidesteps the reward design problem entirely by learning a policy directly from demonstrations.
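The sparse-versus-dense trade-off can be made concrete with a small sketch. The example below is hypothetical (a 2D reaching task with a made-up success tolerance, not from any specific environment): the sparse reward only fires inside the goal tolerance, while the dense reward provides a gradient toward the goal at every step.

```python
import numpy as np

def sparse_reward(state, goal, tol=0.05):
    # 1.0 on success (end-effector within tol of the goal), 0.0 otherwise.
    # Unambiguous, but gives no learning signal until the goal is reached.
    return float(np.linalg.norm(state - goal) < tol)

def dense_reward(state, goal):
    # Negative Euclidean distance to the goal: informative at every step,
    # but shaped terms like this are what reward hacking exploits.
    return -float(np.linalg.norm(state - goal))

goal = np.array([1.0, 1.0])
start = np.array([0.0, 0.0])

print(sparse_reward(start, goal))  # 0.0 -- no signal far from the goal
print(sparse_reward(goal, goal))   # 1.0 -- success
print(dense_reward(start, goal))   # about -1.414, shrinks toward 0 near the goal
```

In practice the two are often combined: a dense shaping term to speed up exploration plus a sparse success bonus that defines the actual task.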