Q-function (Action-Value Function)

The Q-function Q(s, a) gives the expected cumulative discounted reward an agent receives by taking action a in state s and then following a given policy thereafter. In practice it is estimated from experience, and this estimate is central to reinforcement learning algorithms such as DQN (discrete actions) and SAC, TD3, and DDPG (continuous actions). In robot RL, learning accurate Q-functions for long-horizon manipulation tasks is challenging because rewards are often sparse and the state-action space is high-dimensional. Recent work in offline RL (IQL, CQL) uses Q-functions to extract policies from fixed datasets without online interaction, bridging the gap between imitation learning and RL.
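
To make the definition concrete, here is a minimal sketch of tabular Q-learning, the table-based update that DQN approximates with a neural network. It is an illustration under stated assumptions, not a reference implementation. The 4-state chain environment, the `step` and `q_learning_update` helpers, and the values of `alpha` and `gamma` are all hypothetical choices made for this example.

```python
import numpy as np

def q_learning_update(q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Bootstrap from the best next action unless the episode has ended.
    target = r if done else r + gamma * np.max(q[s_next])
    td_error = target - q[s, a]
    q[s, a] += alpha * td_error
    return td_error

# Hypothetical 4-state chain: action 1 moves right, action 0 moves left.
# The reward is sparse: 1.0 only on reaching the last state, which is terminal.
N_STATES, N_ACTIONS = 4, 2

def step(s, a):
    s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    done = s_next == N_STATES - 1
    return s_next, (1.0 if done else 0.0), done

q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(500):  # deterministic sweeps over all non-terminal (s, a) pairs
    for s in range(N_STATES - 1):
        for a in range(N_ACTIONS):
            s_next, r, done = step(s, a)
            q_learning_update(q, s, a, r, s_next, done)

greedy_policy = q.argmax(axis=1)  # greedy action for each state
print(np.round(q, 3))
print(greedy_policy[: N_STATES - 1])
```

In this toy setting the learned values decay by a factor of `gamma` for each step of distance from the rewarding transition, which shows how a sparse reward propagates backward through the table. The greedy policy moves right in every non-terminal state. The sweeps over all state-action pairs stand in for exploration. A real agent would collect transitions by interacting with its environment.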
Reinforcement Learning, Value Function
