Robotics Glossary

738+ terms covering imitation learning, VLA models, teleoperation, kinematics, and embodied AI — written for researchers, engineers, and enterprise teams.

65 terms · A–Z organized · Updated 2026

A

ACT (Action Chunking with Transformers)

ACT is an imitation learning algorithm introduced by Tony Zhao et al. (2023) that trains a transformer-based policy to predict a fixed-length chunk of future actions rather than a single action at each timestep. By predicting action sequences in one shot, ACT reduces the compounding error typical of step-by-step behavioral cloning and produces temporally consistent motion. The architecture is a conditional VAE: during training, an encoder compresses the demonstrated action sequence into a latent variable, and a transformer decoder conditions on RGB observations, proprioceptive state, and that latent to decode the action chunk. ACT was demonstrated on the ALOHA bimanual platform, achieving strong performance on tasks such as opening a bag and transferring eggs. See also: Action Chunking (deep dive).

Policy · Transformer · Imitation Learning

Action Space

The action space is the complete set of outputs a robot policy can produce at each timestep. For a robot arm it typically includes joint positions, joint velocities, or end-effector poses (Cartesian position + quaternion); for a mobile robot it includes wheel velocities or steering commands. Action spaces are described as either discrete (a finite menu of actions) or continuous (real-valued vectors). The dimensionality and representation of the action space strongly influence how easy it is to train a stable policy: end-effector delta-pose spaces are often easier for imitation learning, while joint-torque spaces give finer force control but require more careful normalization.

Policy · Control

ALOHA (A Low-cost Open-source Hardware System for Bimanual Teleoperation)

ALOHA is an open-source bimanual teleoperation system developed at Stanford, consisting of two ViperX 300 follower arms and two WidowX 250 leader arms mounted on a shared frame, with wrist cameras on the follower arms plus scene cameras. It was designed to collect high-quality demonstration data at low cost — the original build is under $20,000 — and underpins the ACT policy experiments. Mobile ALOHA extends the platform with a wheeled base, enabling whole-body loco-manipulation tasks such as cooking and cleaning. ALOHA datasets are publicly available and have become a de facto benchmark for bimanual manipulation research. Learn more at SVRC Data Services.

Hardware · Teleoperation · Bimanual

AMR (Autonomous Mobile Robot)

An autonomous mobile robot navigates through its environment without fixed tracks or human guidance, using onboard sensors (LiDAR, cameras, IMU) combined with SLAM, path-planning, and obstacle-avoidance algorithms. Unlike AGVs (automated guided vehicles) that follow magnetic strips, AMRs build and update a map in real time and re-route dynamically around people and objects. Modern warehouse AMRs from companies like Locus Robotics, 6 River Systems, and Mobile Industrial Robots (MiR) have driven broad adoption in logistics. AMRs are often combined with manipulator arms to create mobile manipulators capable of pick-and-place at scale.

Mobile Robotics · Navigation · SLAM

A* Algorithm

A graph search algorithm that finds the shortest path from a start node to a goal node, expanding nodes in order of f(n) = g(n) + h(n) — the cost accumulated so far plus a heuristic estimate of the remaining cost to the goal. A* is optimal and complete when the heuristic is admissible (never overestimates the true cost). In robotics, A* is used for grid-based path planning, typically on occupancy grid maps produced by SLAM.
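
The search can be sketched in a few lines. This is an illustrative implementation for a 4-connected occupancy grid with a Manhattan-distance heuristic; the function name and grid encoding are our own, not from any particular library:

```python
import heapq

def astar(grid, start, goal):
    """Shortest path on a 4-connected occupancy grid (0 = free,
    1 = occupied) using the admissible Manhattan heuristic."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), start)]            # priority queue ordered by f = g + h
    g_cost, parent, closed = {start: 0}, {start: None}, set()
    while open_set:
        _, node = heapq.heappop(open_set)
        if node in closed:
            continue                          # stale queue entry
        closed.add(node)
        if node == goal:                      # walk parents back to start
            path = [node]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        r, c = node
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nb
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g_cost[node] + 1
                if ng < g_cost.get(nb, float("inf")):
                    g_cost[nb] = ng
                    parent[nb] = node
                    heapq.heappush(open_set, (ng + h(nb), nb))
    return None                               # goal unreachable
```

On a 3×3 grid with a wall across the middle row, the planner routes around the obstacle; it returns None when no path exists.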

Navigation · Planning

Accelerometer

An inertial sensor that measures linear acceleration along one or more axes. MEMS accelerometers are ubiquitous in robotics IMUs, providing data for tilt estimation, vibration monitoring, and step detection. Combined with gyroscopes and magnetometers in an IMU, they enable robust orientation estimation via sensor fusion algorithms like complementary or Kalman filters.
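
As a sketch of the complementary filter mentioned above, here is a single-axis pitch update assuming idealized, bias-free sensors; the function name and blend factor are illustrative:

```python
import math

def complementary_filter(pitch, gyro_rate, accel_x, accel_z, dt, alpha=0.98):
    """One pitch update fusing gyro integration with the accelerometer
    tilt estimate. At rest the accelerometer measures only gravity, so
    atan2 recovers the tilt angle; alpha weights the smooth-but-drifting
    gyro path against the noisy-but-drift-free accel path."""
    pitch_acc = math.atan2(accel_x, accel_z)   # tilt from gravity direction
    pitch_gyro = pitch + gyro_rate * dt        # integrate angular rate
    return alpha * pitch_gyro + (1 - alpha) * pitch_acc
```

Called at the IMU sample rate, the accelerometer term continuously corrects the slow drift that pure gyro integration would accumulate.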

Sensors · Hardware

Action Distillation

A technique where a complex teacher policy (e.g., an RL policy with privileged state access) is distilled into a simpler student policy (e.g., a visuomotor policy using only camera images). The student is trained via behavioral cloning on the teacher's rollouts. This two-stage approach is common in sim-to-real transfer: train a performant teacher in simulation with full state, then distill to a deployable vision policy.

Robot Learning · Transfer Learning

Action Space Normalization

Scaling robot action values to a standard range (typically [-1, 1]) before training a policy. Normalization ensures that all action dimensions contribute equally to the loss function and prevents large-valued dimensions from dominating gradient updates. Denormalization converts policy outputs back to physical units (radians, meters/s) at execution time.
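
A minimal min/max normalizer along these lines (the [-1, 1] convention and helper names are illustrative; some codebases z-score with mean/std instead):

```python
import numpy as np

def make_normalizer(actions):
    """Fit per-dimension min/max over a demonstration set (N, dim);
    return functions mapping actions to [-1, 1] and back."""
    lo, hi = actions.min(axis=0), actions.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)        # guard constant dimensions
    def normalize(a):
        return 2.0 * (a - lo) / span - 1.0
    def denormalize(a):
        return (a + 1.0) / 2.0 * span + lo
    return normalize, denormalize
```

The policy trains on normalized targets; at deployment, `denormalize` maps its outputs back to physical units before they reach the controller.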

Data · Robot Learning

Actuator

A device that converts energy (electrical, hydraulic, or pneumatic) into mechanical motion. In robotics, actuators drive joints and end-effectors. Common types include DC motors, stepper motors, servo motors, linear actuators, and series elastic actuators (SEAs). Actuator selection directly impacts a robot's payload, speed, precision, and compliance.

Hardware · Actuation

Adaptive Gripper

A gripper that mechanically adapts its finger shape to conform to the object being grasped, often through underactuation (fewer motors than DOF). Adaptive grippers can handle diverse object shapes without requiring precise grasp pose planning. Examples include the Robotiq Adaptive Gripper and FinRay-based designs.

Hardware · Grasping

Admittance Control

A force control strategy where the robot reads force input and computes a motion response: the controller takes measured forces and outputs desired velocities or positions. Admittance control is the dual of impedance control — it is suited for stiff position-controlled robots (most industrial arms) that need to behave compliantly when interacting with the environment or humans.
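
A one-dimensional discrete-time sketch of the idea, with illustrative virtual mass and damping values (real controllers run this per Cartesian axis at the control rate):

```python
def admittance_step(v, f_ext, dt, m_virt=2.0, d_virt=20.0):
    """One step of a 1-DOF admittance law M*dv/dt + D*v = f_ext:
    measured external force in, commanded velocity out. m_virt (kg)
    and d_virt (N*s/m) set how heavy and damped the arm feels."""
    dv = (f_ext - d_virt * v) / m_virt
    return v + dv * dt
```

Under a constant 10 N push the commanded velocity settles at f/D = 0.5 m/s; remove the force and it decays back to zero, so the stiff position-controlled robot feels compliant.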

Control · Safety

Affordance

The set of actions that an object or environment feature supports. In robotics, affordance detection identifies how objects can be grasped, pushed, opened, poured from, or otherwise manipulated. Visual affordance models predict pixel-wise or region-wise actionability from images. Understanding affordances enables robots to interact with novel objects by recognizing their functional properties.

Vision · Manipulation

AGV

Automated Guided Vehicle — a mobile robot that follows fixed paths (magnetic strips, painted lines, wires embedded in the floor) to transport materials in industrial settings. AGVs are simpler and cheaper than AMRs but cannot dynamically re-route. They are widely deployed in manufacturing, warehousing, and hospital logistics for repetitive transport tasks.

Mobile Robotics · Industrial

Antipodal Grasp

A grasp where two contact points are positioned such that the contact normals point in opposite directions (antiparallel). Antipodal grasps with friction cones that overlap the line between contacts are guaranteed to achieve force closure. They are the most common grasp type for parallel-jaw grippers and the basis for most analytical grasp planning algorithms.
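
A geometric check along these lines can be sketched as follows. It assumes inward-pointing contact normals and tests whether the line joining the contacts lies inside both friction cones (half-angle atan(mu)); names and the default friction coefficient are illustrative:

```python
import math

def _angle(a, b):
    """Angle in radians between two 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

def is_antipodal(p1, n1, p2, n2, mu=0.5):
    """True if the contact line lies inside both friction cones,
    i.e. the grasp achieves force closure under Coulomb friction mu."""
    u = [b - a for a, b in zip(p1, p2)]            # direction p1 -> p2
    cone = math.atan(mu)                           # friction cone half-angle
    return _angle(n1, u) <= cone and _angle(n2, [-x for x in u]) <= cone
```

With mu = 0 this reduces to the strict antipodal condition: both normals exactly antiparallel along the contact line.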

Manipulation · Grasping

Autoregressive Policy

A policy architecture that generates multi-dimensional actions one dimension at a time, conditioning each subsequent dimension on the previously generated ones. Autoregressive policies can model complex, multi-modal action distributions more faithfully than Gaussian policies. The trade-off is slower inference due to sequential generation. Used in BeT (Behavior Transformers) and some diffusion policy variants.

Robot Learning · Policy

Activation Function

A nonlinear function applied element-wise to the output of a neural network layer. Common activations include ReLU, GELU, sigmoid, tanh, and SiLU/Swish. The choice of activation affects training dynamics, gradient flow, and representational capacity. GELU is standard in transformers; ReLU and its variants dominate convolutional architectures.

ML · Architecture

Adam Optimizer

An adaptive learning rate optimization algorithm that maintains per-parameter first and second moment estimates of gradients. Adam combines the benefits of AdaGrad (per-parameter learning rates) and RMSProp (exponential moving average of squared gradients). It is the default optimizer for training most robot learning models, with AdamW (decoupled weight decay) being the most common variant.
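
The update rule can be sketched as a textbook Adam step in NumPy; the function name and state layout are our own:

```python
import numpy as np

def adam_step(theta, grad, state, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. state = (m, v) moment estimates; t is the
    1-indexed step count used for bias correction."""
    m, v = state
    m = b1 * m + (1 - b1) * grad          # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)             # bias-corrected moments
    v_hat = v / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), (m, v)
```

Minimizing f(x) = x² from x = 1 with lr = 0.01 drives x toward 0 within a few hundred steps, illustrating the roughly constant per-step displacement Adam takes while gradients keep a consistent sign.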

ML · Training

Attention Mechanism

A neural network component that computes weighted combinations of value vectors based on the compatibility between query and key vectors. Self-attention (used in transformers) allows each position in a sequence to attend to all other positions, capturing long-range dependencies. Cross-attention enables one sequence to attend to another (e.g., language tokens attending to visual features).
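
A minimal NumPy sketch of scaled dot-product attention as described (single head, no masking; names are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
    Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # query-key compatibility
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ V, weights
```

Self-attention sets Q, K, V from the same sequence; cross-attention takes Q from one sequence (e.g. language tokens) and K, V from another (e.g. visual features).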

ML · Transformer

Autoencoder

A neural network trained to compress input data into a low-dimensional latent representation (encoder) and reconstruct the original data from this representation (decoder). Variational autoencoders (VAEs) add a probabilistic structure to the latent space. In robotics, autoencoders learn compact state representations from high-dimensional observations for efficient policy learning.

ML · Representation Learning

ABB IRB

A family of industrial robot arms by ABB, one of the 'Big Four' robot manufacturers. IRB models range from small assembly robots (IRB 120, 3kg payload) to heavy-duty handlers (IRB 7600, 500kg). ABB robots use the RAPID programming language and IRC5 controller. They are widely deployed in automotive, electronics, and food manufacturing.

Hardware · Industrial

Allegro Hand

A 16-DOF robotic hand with four fingers (4 joints each), designed for dexterous manipulation research. The Allegro Hand is widely used in academic labs for in-hand manipulation, RL-trained dexterity, and tactile manipulation experiments. It provides torque sensing at each joint and can be controlled at 333 Hz. Made by Wonik Robotics.

Hardware · Dexterous

ANYmal

A quadruped robot developed by ETH Zurich / ANYbotics for industrial inspection in hazardous environments. ANYmal features torque-controlled joints (SEAs), multiple sensor payloads, and IP67 protection. It can climb stairs, navigate rough terrain, and perform autonomous inspection missions. ANYmal has been a key platform for RL-based locomotion research.

Hardware · Locomotion

Atlas

Boston Dynamics' bipedal humanoid robot, featuring 28 hydraulic actuators and advanced whole-body control. Atlas demonstrates highly dynamic movements: backflips, parkour, and dancing. The electric version (2024) uses electric actuators with a broader range of motion. Atlas represents the state of the art in dynamic bipedal locomotion.

Hardware · Humanoid · Locomotion

A3C

Asynchronous Advantage Actor-Critic — a deep RL algorithm that trains multiple agent instances in parallel on separate environment copies, asynchronously updating a shared policy network. A3C was one of the first algorithms to achieve superhuman performance on Atari games and demonstrated that parallel data collection can replace experience replay for stable training.

RL

Agricultural Robot

A robot designed for farming tasks: planting, weeding, harvesting, spraying, monitoring crop health, and soil sampling. Agricultural robots must operate in unstructured outdoor environments with variable weather, terrain, and crop conditions. Key technologies include GPS-guided navigation, computer vision for crop identification, and gentle manipulation for harvesting.

Applications · Mobile Robotics

Autonomous Vehicle

A vehicle capable of navigating and driving without human input, using sensors (cameras, LiDAR, radar), perception algorithms, and planning systems. Autonomous vehicles are classified by SAE levels (L0-L5). While distinct from traditional robotics, AVs share core technologies: SLAM, object detection, motion planning, and sensor fusion.

Applications · Navigation

Adaptive Control

A control strategy that adjusts its parameters in real time based on observed system behavior, without requiring an accurate a-priori model. Adaptive controllers are useful when robot dynamics change over time — due to varying payloads, wear, or configuration changes. Model Reference Adaptive Control (MRAC) and Self-Tuning Regulators (STR) are the two main frameworks.

Control

ArUco Markers

A family of fiducial markers (printed square patterns with unique binary codes) used for camera pose estimation and object tracking in robotics. ArUco markers are detected by OpenCV with sub-pixel corner accuracy, enabling precise camera-to-marker transform estimation. Used for robot calibration, coordinate frame alignment, and augmented reality overlays in robot systems.

Vision · Calibration

Approach Direction

The direction from which the robot's end-effector approaches an object before making contact. Approach direction determines grasp quality: approaching from above (top grasp) vs. from the side (side grasp) yields different contact geometries and force distributions. Collision-free approach directions must be planned to avoid obstacles near the object.

Manipulation · Grasping

Aerospace Robot

A robot designed for aerospace manufacturing or maintenance: drilling, riveting, painting, inspection, and composite layup on aircraft fuselages. Aerospace robots must achieve tight tolerances (±0.1mm) on large workpieces (up to 50m) and handle composite materials. Gantry robots, collaborative systems, and mobile platforms are all used in aerospace facilities.

Applications · Industrial

Automated Guided Vehicle

A materials-handling vehicle following fixed paths (magnetic tape, laser reflectors, QR codes) to transport loads in factories and warehouses. AGVs were the precursor to modern AMRs. They offer predictable, repeatable transport but cannot adapt to layout changes. Modern hybrid vehicles combine AGV reliability with AMR flexibility.

Applications · Industrial · Mobile Robotics

Automated Storage and Retrieval

A system of automated cranes and racks for high-density storage and retrieval of goods in warehouses. ASRS robots move in 3D: horizontally along aisles, vertically along racks, and into rack depth. They achieve dense storage utilization (3-4× manual warehouses) and continuous operation. Integration with WMS (Warehouse Management Systems) enables inventory tracking.

Applications · Industrial

Adversarial Robustness

The ability of a robot perception or policy system to maintain correct behavior under deliberately crafted adversarial inputs (sensor perturbations, manipulated images, physical adversarial patches). Adversarial robustness is a safety concern for autonomous systems deployed in environments where malicious actors may attempt to fool the robot's perception.

Robot Learning · Safety · Vision

B

Behavioral Cloning (BC)

Behavioral cloning is the simplest form of imitation learning: a supervised regression problem where the policy is trained to mimic expert demonstrations by minimizing the prediction error between the policy's output and the expert's action at each observed state. BC is easy to implement and scales well with data, but suffers from distributional shift — because it never receives corrective feedback, small errors cause the robot to visit states not present in the training data, which can cascade into task failure. Techniques like DAgger (Dataset Aggregation) and GAIL were developed specifically to address BC's compounding-error problem.
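
At its core, BC is ordinary supervised regression on expert state-action pairs. A toy sketch with a linear policy and synthetic "demonstrations" (real systems use deep networks over images and proprioception; every name and value here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy demonstrations: the expert action is a fixed linear function
# of the state, so the expert weights are exactly recoverable.
W_expert = np.array([[0.5, -1.0], [2.0, 0.3]])
states = rng.normal(size=(256, 2))
actions = states @ W_expert.T                  # expert action labels

# Fit a linear policy by gradient descent on the MSE imitation loss.
W = np.zeros((2, 2))
for _ in range(500):
    pred = states @ W.T
    grad = 2.0 / len(states) * (pred - actions).T @ states
    W -= 0.1 * grad

mse = float(np.mean((states @ W.T - actions) ** 2))
```

The loss measures error only on states the expert visited, which is exactly why distributional shift hurts: the policy gets no signal about states outside the demonstration distribution.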

Imitation Learning · Supervised Learning

Bimanual Manipulation

Bimanual manipulation refers to tasks that require two robot arms working in coordination, analogous to how humans use both hands simultaneously. Examples include folding laundry, tying knots, opening jars, and assembling parts that must be stabilized by one hand while the other performs fine operations. Bimanual tasks are substantially harder than single-arm tasks because the policy must coordinate two high-dimensional action streams while respecting physical constraints between the arms. The ALOHA platform was purpose-built for collecting bimanual demonstrations, and ACT is among the leading policies for bimanual control.

Manipulation · Hardware

BOM (Bill of Materials)

In robotics hardware, the BOM lists every component, subassembly, part number, quantity, and unit cost required to build a system. Accurate BOMs are critical for production scale-up, procurement, supply-chain risk management, and cost modeling. For open-source robot platforms such as OpenArm or ALOHA, a published BOM allows external teams to reproduce the hardware without proprietary dependencies. Enterprise teams evaluating robot deployment often request a BOM to benchmark total cost of ownership against lease or robot-as-a-service alternatives — compare SVRC leasing options.

Hardware · Manufacturing

Backdrivable Actuator

An actuator whose output shaft can be moved by external forces without energizing the motor. Backdrivability is desirable in collaborative robots because it allows safe physical human-robot interaction — if a person pushes the arm, it yields rather than resisting. Series elastic actuators and direct-drive motors are inherently backdrivable; high-ratio gearboxes are typically not.

Hardware · Safety · Actuation

Behavior Transformer

A transformer-based policy architecture (BeT) that discretizes the continuous action space into clusters and uses an autoregressive transformer to predict action token sequences. BeT can represent multi-modal action distributions — when multiple valid actions exist for the same observation — which is a key advantage over MSE-based behavioral cloning that averages over modes.

Robot Learning · Transformer

Behavior Tree

A hierarchical task-switching architecture that organizes robot behaviors as a tree of nodes: sequences (execute children in order), selectors (try children until one succeeds), decorators (modify child behavior), and action/condition leaves. Behavior trees are more modular and maintainable than finite state machines for complex robot behaviors. They are widely used in game AI and increasingly in robotics.

Software · Planning

Bin Picking

The task of picking individual objects from a bin or container filled with randomly arranged items — a classic unstructured manipulation challenge. Bin picking requires robust perception (dealing with clutter and occlusion), grasp planning (finding collision-free grasps), and motion planning. It is one of the highest-volume industrial applications of robotic manipulation.

Manipulation · Industrial

Bipedal Robot

A robot that walks on two legs, like a human. Bipedal robots face the challenging balance problem of maintaining stability on a small support polygon. Control approaches include ZMP (Zero Moment Point), capture point, and RL-based whole-body control. Examples: Atlas (Boston Dynamics), Digit (Agility Robotics), ARTEMIS, Unitree H1. Bipedal locomotion enables navigation in human-built environments.

Locomotion · Humanoid

Brushless DC Motor

A synchronous electric motor driven by DC power through an electronic commutation circuit rather than mechanical brushes. BLDC motors offer higher efficiency, lower maintenance, and better torque-to-weight ratio than brushed motors. They are the dominant actuator in modern robot arms, drones, and mobile bases. Field-oriented control (FOC) is the standard drive algorithm.

Hardware · Actuation

Backpropagation

The algorithm for computing gradients of a loss function with respect to neural network parameters by applying the chain rule layer by layer from output to input. Backpropagation enables gradient-based optimization of deep networks. It requires storing intermediate activations during the forward pass, which dominates memory consumption during training.

ML · Training

Batch Normalization

A technique that normalizes layer inputs across the mini-batch to have zero mean and unit variance, then applies a learned affine transformation. Batch normalization stabilizes training, enables higher learning rates, and acts as a regularizer. It is standard in CNNs but less common in transformers, which typically use layer normalization instead.

ML · Architecture

Batch Size

The number of training examples processed in one forward-backward pass. Larger batch sizes provide more stable gradient estimates but require more memory and may converge to sharper minima. In robot learning, batch sizes are often limited by GPU memory, especially when training on high-resolution images. Gradient accumulation enables effective large batches on limited hardware.

ML · Training

BERT

Bidirectional Encoder Representations from Transformers — a pre-trained language model that learns contextual text representations by masking tokens and predicting them from surrounding context. In robotics, BERT-style models encode natural language instructions for language-conditioned policies. Sentence-BERT embeddings are used for task retrieval and instruction similarity matching.

ML · Vision-Language

Baxter

A dual-arm collaborative robot by Rethink Robotics, featuring series elastic actuators, a screen-based 'face', and two 7-DOF arms. Baxter was designed for flexible manufacturing and was one of the first commercial cobots. Though Rethink Robotics closed in 2018, Baxter remains widely used in research labs for manipulation experiments.

Hardware · Manipulation

Boston Dynamics Spot

A quadruped robot by Boston Dynamics designed for industrial inspection, remote monitoring, and data collection. Spot weighs 32kg, carries 14kg payload, operates for 90 minutes, and navigates stairs, rough terrain, and confined spaces. It features an arm attachment for manipulation and is commercially deployed in construction, energy, and mining sectors.

Hardware · Locomotion · Industrial

Backstepping

A recursive Lyapunov-based control design technique for nonlinear systems in strict-feedback form. Starting from an inner subsystem and working outward, each step designs a virtual control law and adds a stabilizing term. Backstepping guarantees stability via Lyapunov functions and is used for underactuated robots and nonlinear trajectory tracking.

Control

Background Subtraction

A video processing technique that separates foreground objects from a static or slowly changing background by modeling the background appearance and flagging pixels that deviate significantly. In robot pick-and-place on conveyor belts or fixed work surfaces, background subtraction provides fast, lightweight object detection without deep learning inference.

Vision

Bundle Adjustment

A nonlinear least-squares optimization that jointly refines camera poses and 3D point positions to minimize reprojection errors across multiple images. Bundle adjustment is the core optimization in structure-from-motion (SfM) and SLAM. COLMAP and g2o implement efficient sparse bundle adjustment. It is the gold standard for accurate 3D reconstruction.

Vision · SLAM

Balance Recovery

The ability of a legged robot to recover from perturbations (pushes, terrain irregularities) that threaten to cause a fall. Balance recovery controllers detect instability via foot contact forces and CoM velocity and execute corrective motions (stepping, trunk tilting). RL-trained recovery policies generalize to diverse perturbation types without hand-crafted rules.

Locomotion

Ball Bearing

An antifriction bearing that uses balls to maintain separation between inner and outer rings, providing low rolling friction and high rotational speed capability. Ball bearings are the standard in robot joint design, enabling free rotation with minimal torque loss. Deep groove ball bearings handle radial and axial loads; angular contact bearings support combined loads.

Hardware

Belt Drive

A power transmission system using a flexible belt running over pulleys to transfer rotary motion between shafts. Toothed (timing) belts provide positive engagement without slipping; flat and V-belts rely on friction. Belt drives are used in robot joints and linear actuators to achieve gear ratios, transmit power over distance, and reduce vibration.

Hardware · Actuation

Brushes (Motor)

Conductive graphite contacts that slide against a rotating commutator to transfer current in brushed DC motors. Brushes wear over time and require periodic replacement — a key maintenance concern for high-cycle industrial robots. Brushless motors eliminate brushes, dramatically increasing service life and enabling higher speeds.

Hardware · Actuation

Bayesian Optimization

A sequential optimization strategy for expensive black-box functions that builds a probabilistic surrogate model (Gaussian process) of the objective, then selects the next evaluation point by maximizing an acquisition function (EI, UCB). In robotics, Bayesian optimization is used for tuning controller gains, reward function parameters, and hyperparameters with minimal evaluations.

Math · ML · Control

Bezier Curve

A parametric curve defined by control points, commonly used for smooth trajectory generation. Quadratic and cubic Bezier curves are the most common: they guarantee smooth interpolation through endpoints with derivative boundary conditions. In robot path planning, Bezier curves generate smooth, differentiable trajectories for Cartesian and joint space motion.
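
A cubic Bezier evaluated via the Bernstein basis, as a sketch (the function name is ours):

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1].
    The curve starts at p0 (tangent toward p1) and ends at p3
    (tangent from p2); each point is a tuple of coordinates."""
    s = 1.0 - t
    # Bernstein basis weights for the four control points
    b0, b1, b2, b3 = s**3, 3 * s**2 * t, 3 * s * t**2, t**3
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))
```

Sampling t densely yields a smooth waypoint sequence; because the curve is a polynomial in t, velocity and acceleration profiles follow analytically from its derivatives.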

Math · Planning

B-spline

B-spline (Basis Spline) — a piecewise polynomial curve defined by control points and a knot vector, providing local control (moving one control point only affects nearby curve segments). B-splines are used in robot trajectory planning for their numerical stability, smoothness, and ability to satisfy velocity and acceleration boundary conditions.

Math · Planning

Battery Swapping

Replacing a depleted battery module with a fully charged one — an alternative to tethered charging that provides near-instant energy replenishment for mobile robots. Battery swapping infrastructure maintains uptime in round-the-clock warehouse operations where robots can't afford to wait for multi-hour charge cycles. Standardized battery interfaces enable multi-vendor compatibility.

Hardware · Applications · Mobile Robotics

Behavioral Regularization

A family of offline RL methods that constrain the learned policy to remain close to the behavioral policy that collected the data, preventing exploitation of out-of-distribution actions. Methods include: policy constraint (TD3+BC), KL divergence penalty, and support constraint (BEAR). Behavioral regularization is the key mechanism enabling stable offline RL.

Robot Learning · RL

C

Cartesian Space (Task Space)

Cartesian space (also called task space or operational space) describes a robot's configuration in terms of the position and orientation of its end-effector relative to a world or base frame, typically expressed as (x, y, z, roll, pitch, yaw) or (x, y, z, quaternion). Controlling a robot in Cartesian space is often more intuitive for imitation learning because human demonstrations map naturally to end-effector trajectories. The transformation from joint space to Cartesian space is called forward kinematics; the inverse is inverse kinematics.
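
As a minimal example of forward kinematics mapping joint space to Cartesian space, consider a planar 2-link arm (link lengths are illustrative; real arms use full homogeneous-transform chains):

```python
import math

def fk_2link(q1, q2, l1=0.3, l2=0.25):
    """Forward kinematics of a planar 2-link arm: joint angles (rad)
    to end-effector position (x, y) in metres and orientation (rad)."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y, q1 + q2
```

Inverse kinematics runs this mapping backwards: given a target (x, y), solve for (q1, q2), which for this arm has up to two solutions (elbow-up and elbow-down).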

Kinematics · Control

Co-training

Co-training in robotics refers to training a single policy on data from multiple robot embodiments, tasks, or environments simultaneously. The hypothesis is that diverse data sources teach the policy robust visual and behavioral representations that transfer better to new settings. The Open X-Embodiment dataset was assembled specifically to enable co-training across more than 22 robot types. Large foundation models like RT-2 and OpenVLA rely on co-training with internet-scale vision-language data alongside robot demonstration data to bootstrap generalization.

Training · Generalization · Foundation Model

Contact-rich Manipulation

Contact-rich manipulation tasks are those where purposeful, sustained contact between the robot and environment is essential to task success — such as peg-in-hole insertion, screwing bolts, folding fabric, or kneading dough. These tasks are challenging because small positional errors produce large force spikes, and stiff position controllers can damage parts or destabilize the robot. Successful approaches combine compliant control (impedance or admittance control), force-torque sensing, and learned policies that anticipate and exploit contact.

Manipulation · Control · Force Sensing

Continuous Control

Continuous control refers to robot policies that output real-valued action vectors (e.g., joint torques, velocities, or Cartesian deltas) rather than selecting from a discrete set of actions. Most physical robot manipulation tasks require continuous control because smooth, precise motion cannot be adequately represented by a finite action menu. Standard deep RL algorithms for continuous control include DDPG, TD3, and SAC; for imitation learning, behavioral cloning and Diffusion Policy are commonly used in continuous action spaces.

Control · Reinforcement Learning

Cable Manipulation

Manipulating deformable linear objects (cables, wires, ropes, hoses) — a challenging domain because deformable objects have infinite-dimensional state spaces and complex physics. Tasks include routing cables through clips, untangling, and connector insertion. Cable manipulation is important for manufacturing (wiring harnesses) and domestic robotics (cable management).

Manipulation

Calibration

The process of determining and correcting systematic errors in sensors, actuators, or geometric relationships. In robotics, key calibrations include: camera intrinsic calibration (focal length, distortion), hand-eye calibration (camera-to-robot transform), kinematic calibration (DH parameter correction), and force-torque sensor zeroing. Accurate calibration is a prerequisite for precision manipulation and measurement.

Calibration · Sensors

Camera Extrinsics

The 6-DOF pose (rotation + translation) of a camera relative to a reference frame — typically the robot base or world frame. Extrinsic calibration is required for hand-eye coordination: the robot must know where each camera is to transform visual observations into actionable coordinates. Eye-in-hand and eye-to-hand are the two standard configurations.

Sensors · Vision · Calibration

Camera Intrinsics

The internal parameters of a camera that define the mapping from 3D camera coordinates to 2D pixel coordinates: focal length (fx, fy), principal point (cx, cy), and lens distortion coefficients. Accurate intrinsics are essential for depth estimation, 3D reconstruction, and visual servoing. They are determined through camera calibration, typically using a checkerboard pattern and OpenCV.
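
The pinhole projection these parameters define can be sketched as follows (lens distortion omitted; the intrinsic values are illustrative, not from any real camera):

```python
import numpy as np

K = np.array([[600.0,   0.0, 320.0],      # [fx,  0, cx]
              [  0.0, 600.0, 240.0],      # [ 0, fy, cy]
              [  0.0,   0.0,   1.0]])

def project(K, point_cam):
    """Map a 3D point (X, Y, Z) in camera coordinates to pixel
    coordinates: u = fx * X/Z + cx, v = fy * Y/Z + cy."""
    x, y, z = point_cam
    return K[0, 0] * x / z + K[0, 2], K[1, 1] * y / z + K[1, 2]
```

A point on the optical axis projects to the principal point (cx, cy); depth estimation inverts this mapping given a known Z or a stereo pair.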

Sensors · Vision

Capstan Drive

A cable-driven transmission that wraps a cable around a drum to achieve high reduction ratios with zero backlash and high backdrivability. Capstan drives are used in teleoperation master devices and dexterous hands where smooth, low-friction force transmission is critical. The trade-off is limited range of motion and cable wear over time.

Hardware · Actuation

Cartesian Impedance Control

Impedance control applied in Cartesian (task) space rather than joint space, allowing the robot end-effector to behave as a virtual mass-spring-damper system with programmable stiffness and damping in each Cartesian direction. This is the standard control mode for contact-rich manipulation on torque-controlled arms like the Franka Panda.

Control · Manipulation

Cartesian Robot

A robot with three linear (prismatic) axes arranged orthogonally, creating a rectangular workspace. Also called gantry robots. They offer high rigidity, precision, and payload capacity along linear paths. Common in CNC machining, 3D printing, pick-and-place, and packaging. Their workspace is a rectangular prism defined by the travel range of each axis.

Hardware · Industrial

Causal Confusion

A failure mode in imitation learning where the policy learns to rely on spurious correlations in the demonstration data rather than the true causal features. For example, a driving policy might learn that the brake light indicator predicts stopping, rather than learning to stop based on traffic conditions. Causal confusion often manifests as policies that work in-distribution but fail unpredictably on new scenarios.

Robot LearningImitation Learning

CE Marking Robotics

The conformity marking required for products sold in the European Economic Area, indicating compliance with EU health, safety, and environmental protection directives. For robots, CE marking requires compliance with the Machinery Directive (2006/42/EC), EMC Directive, and potentially the Radio Equipment Directive. Risk assessment per ISO 12100 is a prerequisite.

SafetyStandards

Center of Mass

The average position of all mass in a body or system, weighted by mass. For legged robots, keeping the center of mass (CoM) projection within the support polygon is the fundamental balance criterion for static stability. Dynamic gaits allow the CoM to leave the support polygon temporarily, relying on inertial effects to maintain balance.

KinematicsDynamics

CLIP

Contrastive Language-Image Pre-training — a model trained by OpenAI on 400M image-text pairs to learn aligned visual and linguistic representations. CLIP embeddings are used in robotics for open-vocabulary object detection, language-conditioned manipulation, and reward specification. VLA models such as RT-2 leverage CLIP-style vision-language alignment to ground language commands in robotic actions.

Robot LearningVision-Language

Cloth Manipulation

Manipulating fabric and garments — one of the hardest manipulation domains due to cloth's high-dimensional deformable state, self-occlusion, and complex dynamics. Tasks include folding, unfolding, spreading, hanging, and dressing assistance. Learning-based methods (using simulation with deformable-body physics) are the current approach, but sim-to-real transfer for cloth remains challenging.

Manipulation

Co-Training Multi-Robot

Training a single generalist policy on demonstration data collected from multiple different robot embodiments (different morphologies, action spaces, and sensor configurations). The hypothesis is that cross-embodiment data teaches more robust representations. The Open X-Embodiment dataset and RT-X models are the flagship examples, aggregating data from 22+ robot types.

Robot LearningTransfer Learning

Cobots

Collaborative robots designed to work safely alongside humans without safety cages. Cobots feature force-limited joints, rounded edges, low mass, and compliant control. They are used for tasks where human flexibility and robot precision complement each other: assembly assist, machine tending, quality inspection. Market leaders include Universal Robots, Franka Emika, and ABB GoFa.

HardwareSafetyIndustrial

Collaborative Workspace

The space where a human and a collaborative robot operate simultaneously during production. ISO/TS 15066 defines safety requirements for collaborative workspaces, including maximum allowable forces and pressures for transient and quasi-static contact between the robot and different body regions. Workspace design must ensure that collision forces remain below injury thresholds.

SafetyStandards

Compliance

The inverse of stiffness — a measure of how much a mechanical system deforms under applied force. In robotics, compliance can be passive (physical springs, soft materials) or active (control-programmed virtual stiffness). Appropriate compliance is essential for safe human interaction, assembly tasks, and manipulation of fragile objects.

ControlSafety

Compliant Actuator

An actuator that intentionally introduces mechanical compliance (elasticity) between the motor and the output link. Series elastic actuators (SEAs) and variable stiffness actuators (VSAs) are the two main classes. Compliance allows accurate force sensing, shock absorption, and safer interaction with humans and unstructured environments.

HardwareActuationSafety

Computed Torque Control

A model-based control strategy that uses the robot's inverse dynamics model to compute the joint torques required for a desired trajectory, then adds PD feedback to correct tracking errors. It linearizes the nonlinear robot dynamics, enabling the use of linear control design. Accuracy depends on the quality of the dynamics model (mass, inertia, friction parameters).
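
A minimal sketch on a 1-DOF pendulum (parameters hypothetical): the control law is τ = M(q)(q̈_des + Kp·e + Kd·ė) plus the modeled friction and gravity terms:

```python
import numpy as np

# Hypothetical 1-DOF pendulum: M*qdd + b*qd + m*g*l*sin(q) = tau
m, l, b, g = 1.0, 0.5, 0.1, 9.81
M = m * l**2   # constant inertia for a point mass on a massless rod

def computed_torque(q, qd, q_des, qd_des, qdd_des, kp=100.0, kd=20.0):
    # Feedforward acceleration plus PD feedback on the tracking error,
    # mapped through the inverse dynamics model.
    v = qdd_des + kp * (q_des - q) + kd * (qd_des - qd)
    return M * v + b * qd + m * g * l * np.sin(q)
```

With zero tracking error while holding the pendulum horizontal (q = π/2), the law reduces to pure gravity compensation.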

Control

Conditional Imitation Learning

An extension of behavioral cloning where the policy is conditioned on high-level commands or goals in addition to observations. For example, a navigation policy might take a goal image or a language instruction as input alongside the current camera frame. This enables a single policy to execute multiple behaviors depending on the conditioning signal.

Robot LearningImitation Learning

Contact Simulation

The numerical simulation of physical contact between rigid or deformable bodies, including collision detection, contact force computation, and friction modeling. Contact simulation is the hardest part of physics simulation for robotics — small errors in contact models cause large sim-to-real gaps. Approaches include penalty methods, LCP (linear complementarity problem), and compliant contact models.

Simulation

Contact-Rich Manipulation

Manipulation tasks involving sustained, complex contact between the robot and objects or the environment. Examples include insertion, pivoting, pushing, screwing, and wiping. Contact-rich tasks require force-aware control and are poorly suited to purely position-controlled approaches. Diffusion policies and impedance control have shown strong results on contact-rich benchmarks.

ManipulationControl

Contrastive Learning

A self-supervised representation learning approach that trains an encoder to produce similar embeddings for semantically related inputs (positive pairs) and dissimilar embeddings for unrelated inputs (negative pairs). In robotics, contrastive learning is used to learn state representations from unlabeled video, train reward models, and pre-train visual encoders before policy fine-tuning.

Robot LearningRepresentation Learning

Control Barrier Function

A mathematical function that certifies forward invariance of a safe set — ensuring the robot stays within safe operating bounds. CBFs provide formal safety guarantees that can be integrated with any nominal controller by solving a QP (quadratic program) that minimally modifies the control input to maintain safety. They are used for collision avoidance, joint limit enforcement, and safe exploration.
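
For a 1-D single integrator ẋ = u with safe set h(x) = x ≥ 0, the CBF-QP min ‖u − u_nom‖² subject to ḣ ≥ −αh has a closed-form solution; a minimal sketch (names hypothetical):

```python
def cbf_filter(x, u_nom, alpha=1.0):
    """Minimally modify u_nom so that h(x) = x stays nonnegative."""
    u_min = -alpha * x        # CBF condition: h_dot = u >= -alpha * h(x)
    return max(u_nom, u_min)  # closed-form QP solution for a single constraint

safe_u = cbf_filter(0.5, -2.0)   # nominal command would exit the safe set
```

When the nominal command is already safe it passes through unchanged; otherwise it is clipped just enough to satisfy the barrier condition.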

SafetyControl

Coriolis Forces

Velocity-dependent forces that arise in multi-link robot dynamics due to the coupling between joint motions. They appear in the robot's equation of motion as the C(q, dq) matrix multiplied by the joint velocity vector. Coriolis forces are significant at high joint speeds and must be compensated in model-based controllers like computed torque. They vanish when the robot is stationary.

Dynamics

Costmap

A 2D grid representation where each cell stores a cost value indicating how desirable (or dangerous) it is for the robot to traverse that location. Costmaps are used in mobile robot navigation — cells occupied by obstacles have infinite cost, cells near obstacles have high cost (inflation), and free space has low cost. ROS navigation stacks use layered costmaps.
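
A toy inflation sketch (the grid size, lethal cost, and linear falloff are hypothetical choices): occupied cells get the lethal cost and nearby cells a decaying cost:

```python
import numpy as np

def inflate(occ, inflation_radius=2, lethal=100):
    """Build a costmap from a boolean occupancy grid with linear inflation."""
    h, w = occ.shape
    cost = np.zeros((h, w))
    obstacles = np.argwhere(occ)
    for i in range(h):
        for j in range(w):
            if occ[i, j]:
                cost[i, j] = lethal
            elif len(obstacles):
                d = np.abs(obstacles - [i, j]).sum(axis=1).min()  # Manhattan distance
                if d <= inflation_radius:
                    cost[i, j] = lethal * (1 - d / (inflation_radius + 1))
    return cost

occ = np.zeros((5, 5), dtype=bool)
occ[2, 2] = True                  # one obstacle in the center
cost = inflate(occ)
```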

Navigation

Cross-Embodiment Transfer

Transferring learned skills or representations between robots with different physical forms (embodiments). A policy trained on a UR5 arm may partially transfer to a Franka Panda despite different kinematics and dynamics. Foundation models like Octo and RT-X are designed to facilitate cross-embodiment transfer by training on data from diverse robot types.

Robot LearningTransfer Learning

Curriculum Learning

A training strategy that presents tasks or environments to the learner in an ordered sequence of increasing difficulty, analogous to how humans learn. In robot RL, curriculum learning might start with simplified tasks (objects near the gripper, low randomization) and progressively increase difficulty (distant objects, heavy domain randomization). Automatic curriculum generation methods (like PAIRED or PLR) are an active research area.

Robot LearningRL

Cycle Time

The total time to complete one cycle of a robotic operation, from the start of one unit to the start of the next. In manufacturing, cycle time determines throughput and is the key metric for production line design. Reducing cycle time involves optimizing motion paths, reducing idle time, and increasing speeds within safety and quality constraints.

IndustrialManufacturing

CNN

Convolutional Neural Network — an architecture that applies learnable convolutional filters to detect spatial patterns (edges, textures, objects) in images. CNNs are the foundation of visual perception in robotics: object detection, segmentation, depth estimation, and visual feature extraction for manipulation policies. ResNet, EfficientNet, and ConvNeXt are popular CNN architectures.

MLVision

Cosine Annealing

A learning rate schedule that decreases the learning rate following a cosine curve from an initial value to near zero over the training period. Cosine annealing with warm restarts (SGDR) periodically resets the learning rate, which can help escape local minima. It is the default schedule for training vision transformers and many robot learning models.
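
A minimal sketch of the schedule (no warm restarts; the rate values are hypothetical):

```python
import math

def cosine_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    """Learning rate at `step`, decaying from lr_max to lr_min on a cosine."""
    t = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))
```

The rate starts at lr_max, passes through the midpoint value halfway through training, and reaches lr_min at the end.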

MLTraining

Cross-Entropy Loss

A loss function measuring the difference between predicted probability distributions and true labels. For classification, it is the negative log-likelihood of the correct class. In robot learning, cross-entropy is used for discrete action prediction, token generation in VLA models, and training language-conditioned policies where the action space is discretized.
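
For discretized actions, the loss is the negative log-probability of the correct bin under a softmax; a minimal numpy sketch:

```python
import numpy as np

def cross_entropy(logits, target_idx):
    """Negative log-likelihood of the target bin under softmax(logits)."""
    logits = logits - logits.max()                    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[target_idx]

loss_uniform = cross_entropy(np.zeros(4), 2)              # uninformative prediction
loss_confident = cross_entropy(np.array([10.0, 0.0, 0.0]), 0)
```

A uniform prediction over 4 bins costs log 4 nats; a confident correct prediction costs nearly zero.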

MLTraining

CALVIN

A benchmark for evaluating long-horizon language-conditioned manipulation in simulation. CALVIN provides a robotic tabletop environment with 34 manipulation tasks (push, slide, open, close, pick, place) conditioned on natural language instructions. Policies are evaluated on their ability to chain multiple tasks sequentially based on language goals.

BenchmarkRobot Learning

COLOSSEUM

A simulation benchmark that systematically evaluates robot manipulation policies under 14 types of environmental perturbations: lighting changes, table texture, distractor objects, camera position shifts, and more. COLOSSEUM tests the robustness and generalization of learned policies by measuring performance degradation under each perturbation type.

BenchmarkRobot Learning

CQL

Conservative Q-Learning — an offline RL algorithm that learns a conservative (pessimistic) Q-function by adding a penalty for Q-values on out-of-distribution actions. This prevents the overestimation problem that causes offline RL policies to exploit spurious high-value regions not supported by the data. CQL is one of the most widely used offline RL methods for robot manipulation.

RL

Clean Room Robot

A robot designed for operation in controlled environments (semiconductor fabs, pharmaceutical manufacturing) where particle contamination must be minimized. Clean room robots use special lubricants, sealed joints, and electrostatic dissipation to meet ISO cleanroom classifications (ISO 1-8). They handle wafers, substrates, and vials with extreme precision.

ApplicationsIndustrial

Construction Robot

A robot designed for construction tasks: bricklaying, welding, concrete pouring, rebar tying, site surveying, and demolition. Construction robots must operate in dynamic, unstructured environments alongside human workers. Key challenges include outdoor perception, heavy payload manipulation, and adaptation to variable site conditions.

Applications

Cascade Control

A multi-loop control structure where the output of an outer (primary) controller provides the setpoint for an inner (secondary) controller. In robot arm control: the position loop (outer) commands a velocity setpoint to the velocity loop (inner), which commands torque to the torque loop (innermost). Cascade control provides faster disturbance rejection than a single loop.
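
A minimal two-loop sketch with proportional gains only (gains hypothetical):

```python
def cascade_step(pos, vel, pos_ref, kp_pos=5.0, kp_vel=50.0):
    """One control tick: the outer position loop feeds the inner velocity loop."""
    vel_ref = kp_pos * (pos_ref - pos)   # outer loop: position error -> velocity setpoint
    torque = kp_vel * (vel_ref - vel)    # inner loop: velocity error -> torque command
    return torque
```

At the setpoint with zero velocity the command is zero; far from it, the outer loop demands a velocity that the inner loop converts to torque.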

Control

Contraction Theory

A framework for analyzing the stability of nonlinear systems by showing that the distance between any two trajectories converges to zero over time. A system is contracting if its Jacobian is uniformly negative definite. Contraction theory provides a modular approach to stability analysis and has been applied to robot control and learning-based controllers.

ControlMath

Control Lyapunov Function

A positive definite function V(x) for which, at every nonzero state, some control input u makes V̇(x) < 0, guaranteeing the system can be driven to the origin. CLFs are used to formally synthesize stabilizing controllers. Combined with Control Barrier Functions in a single QP, CLF-CBF controllers simultaneously guarantee stability and safety — a key framework for safe robot control.

ControlSafety

Camera Calibration

The process of determining camera intrinsic parameters (focal length, principal point, distortion coefficients) using images of a known calibration target (checkerboard, ChArUco, circle grid). Zhang's method (OpenCV's calibrateCamera) is the standard algorithm. Accurate calibration is prerequisite for 3D reconstruction, depth estimation, and visual servoing.

VisionCalibration

Color Histogram

A representation of the distribution of colors in an image by counting pixels in each color bin. Color histograms are used for object tracking (CamShift, MeanShift), image retrieval, and simple object recognition based on color statistics. They are compact, fast to compute, and viewpoint-invariant but sensitive to lighting changes.
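
A minimal per-channel sketch (the bin count is a hypothetical choice):

```python
import numpy as np

def color_histogram(img, bins=8):
    """Count each channel's pixels into `bins` intensity bins over [0, 256)."""
    hist = np.zeros((3, bins), dtype=int)
    for c in range(3):
        hist[c], _ = np.histogram(img[..., c], bins=bins, range=(0, 256))
    return hist

img = np.zeros((4, 4, 3), dtype=np.uint8)   # all-black 4x4 RGB image
h = color_histogram(img)                    # every pixel lands in bin 0
```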

Vision

Contour Detection

Finding the boundaries of objects in a binary or edge image by tracing connected sets of edge pixels. OpenCV's findContours algorithm extracts contour hierarchies. Contours provide shape descriptors (area, perimeter, moments, convex hull) for object recognition and grasping. Contour-based methods are fast and interpretable alternatives to deep learning for structured environments.

Vision

Correspondence Problem

The fundamental problem in stereo and optical flow: finding matching pixels or features in two or more images that correspond to the same 3D point. Solving correspondence enables depth estimation from stereo cameras and motion estimation from sequential frames. Deep learning approaches (RAFT, StereoNet) have largely replaced hand-crafted matching costs for dense correspondence.

Vision

Capture Point

The point on the ground where a robot must place its next foot to come to a standstill without falling. The Divergent Component of Motion (DCM) framework generalizes the capture point concept to stepping sequences. Capture-point-based controllers generate footstep plans that maintain balance in the presence of external pushes.
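
Under the linear inverted pendulum model the capture point is x_cp = x + ẋ/ω with ω = √(g/z); a minimal sketch (CoM height and velocity are hypothetical):

```python
import math

def capture_point(x, xdot, z_com, g=9.81):
    """Instantaneous capture point of a linear inverted pendulum."""
    omega = math.sqrt(g / z_com)   # natural frequency of the pendulum
    return x + xdot / omega

# CoM directly over the foot, 0.981 m high, moving forward at 0.5 m/s
cp = capture_point(0.0, 0.5, 0.981)
```

A harder push (larger ẋ) moves the required footstep proportionally further ahead.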

LocomotionControl

Central Pattern Generator

A neural oscillator network (biological or artificial) that generates rhythmic locomotion patterns without sensory feedback. CPG-based controllers produce robust, energy-efficient gaits for legged robots. CPG parameters (frequency, amplitude, phase coupling) can be tuned by an outer controller or adapted via RL to the terrain and speed.

LocomotionControl

Contact Schedule

A plan specifying when each foot (or end-effector) of a multi-contact robot is in contact with the environment. Contact schedules define the gait pattern (trot, gallop, walk) and the stance/swing sequence. Trajectory optimization methods jointly optimize the contact schedule and the motion trajectory for energy efficiency and robustness.

LocomotionPlanning

Compliant Grasp

A grasp executed with controlled compliance — the gripper yields to contact forces rather than rigidly resisting them. Compliant grasping reduces the required accuracy of grasp pose estimation by accommodating small positioning errors through mechanical or control compliance. It is essential for grasping fragile objects without damaging them.

ManipulationGraspingControl

Contact Mode

The discrete characterization of a contact event: no contact, rolling, sliding, or sticking. Contact modes define the constraints active at each contact point and determine the feasible manipulations. Contact mode transitions (e.g., rolling to sliding) require coordinated control of contact forces. Hybrid system theory models contact-rich manipulation as a sequence of contact modes.

ManipulationControl

Crop Harvesting

Robotic harvesting of agricultural produce (strawberries, tomatoes, apples, cucumbers) — one of the hardest unstructured manipulation tasks. Harvesting robots must detect ripe fruit (color, texture), estimate 3D pose, navigate to the fruit, grasp delicately to avoid damage, and sever the stem. Variable lighting, occlusion, and deformable produce make this extremely challenging.

ManipulationApplications

Cable Harness

An organized assembly of wires, cables, and connectors bundled together for robot wiring. Cable harnesses route power and signals through the robot structure while managing cable routing, strain relief, and interference. In collaborative robots, internal cable routing through hollow links protects cables and provides clean aesthetics.

Hardware

Carbon Fiber Composite

A lightweight, high-strength structural material widely used in robot arms, drone frames, and exoskeleton links. Carbon fiber reinforced polymer (CFRP) offers a strength-to-weight ratio 5× higher than steel. Its high stiffness minimizes structural deformation under load, improving positioning accuracy. Anisotropic properties require careful fiber orientation design.

HardwareMaterials

Collet

A precision clamping device that grips a cylindrical workpiece or tool shank by contracting around it when a nut is tightened. Collets provide high clamping force and concentricity (runout < 0.01mm). They are used in robot tool changers, CNC machining, and precision assembly to mount end-effectors, sensors, or tools on robot wrists.

Hardware

Cooling System

A system that removes heat generated by motors, electronics, and power amplifiers in robot systems. Cooling methods include: passive (heat sinks, thermal pads), forced air (fans), liquid cooling (water jackets, heat pipes), and thermoelectric cooling. Thermal management is critical for maintaining motor winding temperature below limits and ensuring electronics reliability.

Hardware

Cross-Roller Bearing

A precision bearing where cylindrical rollers are arranged alternately at 90° to each other between inner and outer rings, providing high rigidity and accuracy under combined axial, radial, and moment loads in a compact profile. Cross-roller bearings are standard in robot joint designs (especially collaborative robots and SCARA robots) requiring high tilting moment stiffness.

Hardware

CAN Bus

Controller Area Network — a robust serial communication protocol designed for real-time automotive and industrial applications. CAN bus is widely used in robotics for connecting motor controllers, sensors, and microcontrollers, offering speeds up to 1 Mbps (higher with CAN FD), priority-based message arbitration, and robust error detection. Research robots such as the MIT Cheetah and Boston Dynamics platforms use it for actuator communication.

SoftwareHardwareElectronics

Conjugate Gradient

An iterative algorithm for solving large, sparse linear systems (or minimizing quadratic functions) that is more efficient than Gaussian elimination for high-dimensional problems. Used in robotics for real-time dynamics computation (articulated body algorithm involves solving linear systems) and large-scale trajectory optimization.
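
A textbook (unpreconditioned) sketch for symmetric positive definite systems A x = b:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for SPD A by iterating along conjugate search directions."""
    x = np.zeros_like(b)
    r = b - A @ x            # residual
    p = r.copy()             # first search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)        # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p    # next A-conjugate direction
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite
x = conjugate_gradient(A, np.array([1.0, 2.0]))
```

For an n-dimensional SPD system, CG converges in at most n iterations in exact arithmetic.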

Math

Covariance Matrix

A square matrix encoding the variances of and correlations between elements of a random vector. In robotics, covariance matrices appear in: Kalman filters (state uncertainty), Gaussian process models, measurement noise models, and pose uncertainty representations. The diagonal contains individual variances; off-diagonal elements represent correlations.
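
A quick numpy sketch: sampling noisy 2-D position measurements and estimating their covariance (the noise levels are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
# 5000 position measurements: sigma_x = 0.1 m, sigma_y = 0.2 m, uncorrelated
samples = rng.normal([0.0, 0.0], [0.1, 0.2], size=(5000, 2))
cov = np.cov(samples, rowvar=False)   # 2x2 matrix; diagonal holds the variances
```

The estimate recovers variances near 0.01 and 0.04 with off-diagonal terms near zero, matching the uncorrelated noise model.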

MathSensors

Cross Product

A vector operation producing a vector perpendicular to both input vectors with magnitude equal to the area of the parallelogram they span. Cross products are essential in robotics for: computing joint torques from forces (τ = r × F), computing angular velocities in Jacobian derivation, and finding normal vectors for surface computations.
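
For example, the torque produced by a force applied at the end of a lever arm:

```python
import numpy as np

r = np.array([0.0, 0.5, 0.0])    # lever arm: 0.5 m along y
F = np.array([10.0, 0.0, 0.0])   # force: 10 N along x
tau = np.cross(r, F)             # torque about the origin, tau = r x F
```

The result points along −z with magnitude 5 N·m — the area of the 0.5 m × 10 N parallelogram.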

MathKinematics

Collaborative Manufacturing

A manufacturing paradigm where cobots and humans work side by side on shared tasks, combining human flexibility and cognitive skills with robot precision and endurance. Applications include screw driving, part presentation, quality inspection, and machine tending. ISO/TS 15066 governs the safety requirements for collaborative manufacturing workspaces.

ApplicationsIndustrialSafety

Causal Inference Robotics

Applying causal reasoning to robot learning: identifying cause-effect relationships (rather than correlations) between observations and outcomes. Causal inference helps robots generalize to new environments by learning causal structure (e.g., object color causes visibility but not graspability). Methods include causal discovery, do-calculus, and counterfactual reasoning.

Robot Learning

Chain-of-Thought Planning

Using large language models' chain-of-thought prompting to generate step-by-step task plans for robots. The LLM reasons through a task (e.g., 'to make coffee: 1. Find mug 2. Fill with water 3. Heat...') and each step is grounded to executable robot actions via skill primitives or VLAs. CoT planning improves long-horizon task success rates.

Robot LearningVision-LanguagePlanning

Code as Policies

A paradigm where LLMs generate Python code that directly controls robots, rather than natural language plans. The LLM writes executable code calling robot APIs, enabling precise, loop-based behaviors difficult to express in natural language. Code as Policies naturally handles spatial reasoning (for loops over objects) and conditional logic.

Robot LearningVision-LanguageSoftware

Constraint Learning

Learning task constraints (geometric constraints, contact constraints, safety constraints) from demonstrations, rather than hand-specifying them. A robot learns that 'the cup must stay upright' from demonstrations where cups are always kept upright. Learned constraints can then be enforced during autonomous execution via constrained optimization.

Robot LearningSafety

Contrastive Predictive Coding

A self-supervised representation learning method that trains an encoder to predict future latent representations from current observations. CPC learns temporally coherent representations by solving the pretext task of identifying the correct future frame among distractors. In robotics, CPC pre-trains visual encoders from unlabeled robot video, improving downstream manipulation policy performance.

Robot LearningRepresentation Learning

Context-Conditioned Policy

A policy that receives a context vector encoding task-specific information (current task ID, goal description, demonstration embedding) alongside observations. Context conditioning enables a single policy to handle multiple tasks by switching behavior based on context. Task encoders (from language, goal images, or few-shot demos) provide the context vector.

Robot LearningPolicy

Continual Learning

Learning new tasks sequentially without forgetting previously learned ones (also called lifelong learning). Neural networks are prone to catastrophic forgetting — updating weights for new tasks overwrites old task knowledge. Methods include: EWC (Elastic Weight Consolidation), progressive networks, memory replay, and modular architectures. Critical for real-world robot deployment where new tasks are added over time.

Robot Learning

D

Data Augmentation (for robotics)

Data augmentation in robot learning applies random transformations to training observations to improve policy robustness without collecting additional demonstrations. Common image augmentations include random cropping, color jitter, Gaussian blur, and cutout. More sophisticated augmentations overlay distracting backgrounds, change lighting conditions, or inject sensor noise to prevent overfitting to specific visual features in the training environment. Some approaches augment actions as well — for example, adding noise to joint trajectories to teach the policy to recover from perturbations. Augmentation is especially important when training data is expensive (each demonstration requires human operator time).
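
A minimal random-crop sketch (sizes hypothetical); in practice the crop is usually resized back to the policy's input resolution:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop(img, crop_h, crop_w):
    """Crop a random (crop_h, crop_w) window from an HxWxC image."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]

img = np.arange(64 * 64 * 3, dtype=np.uint8).reshape(64, 64, 3)
crop = random_crop(img, 56, 56)   # a different window each call
```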

TrainingRobustnessData

Degrees of Freedom (DOF)

Degrees of freedom describes the number of independent parameters needed to specify the configuration of a mechanical system. A robot arm with six revolute joints has 6 DOF — enough to position and orient its end-effector arbitrarily within its reachable workspace (barring singularities). A 7-DOF arm adds one redundant joint that allows null-space optimization for obstacle avoidance or comfort poses. Human arms have roughly 7 DOF at the shoulder-elbow-wrist chain, making 7-DOF robots natural choices for anthropomorphic manipulation. Mobile bases add 2–3 DOF; full humanoids exceed 30 DOF.

KinematicsHardware

Demonstration

A demonstration (also called a trajectory or episode in imitation learning contexts) is a recorded sequence of observations and actions provided by a human or expert controller that illustrates how to perform a task. Demonstrations are the primary data source for behavioral cloning and other imitation learning algorithms. They can be collected via teleoperation, kinesthetic teaching, or motion capture. Data quality — smooth motion, consistent task execution, adequate coverage of the task's state space — matters as much as quantity for downstream policy performance. SVRC collects production-quality robot demonstrations through our data services.

DataImitation Learning

Diffusion Policy

Diffusion Policy, introduced by Chi et al. (2023), formulates robot action generation as a denoising diffusion process — the same class of generative models used in image generation. At inference time, the policy iteratively refines a sample of Gaussian noise into a sequence of actions conditioned on the current observation using a learned score network (typically a CNN or transformer). Compared to deterministic behavioral cloning, Diffusion Policy naturally represents multimodal action distributions (multiple valid ways to perform a task) and achieves state-of-the-art results on contact-rich manipulation benchmarks. See the detailed article.

PolicyGenerative ModelImitation Learning

Dexterous Manipulation

Dexterous manipulation refers to fine, multi-fingered manipulation tasks that exploit the full kinematic and sensory capabilities of a robotic hand — in-hand regrasping, rolling objects across fingertips, card dealing, surgical suturing, and similar tasks. Dexterity requires high-DOF end-effectors (5+ fingers, each with 3+ joints), dense tactile sensing, and policies capable of reasoning about complex contact geometry. Reinforcement learning trained in simulation (e.g., OpenAI's Dactyl) and recent diffusion-based policies have pushed the frontier, but dexterous manipulation at human-level reliability remains an open research problem.

ManipulationHardwareResearch Frontier

D* Lite

An incremental replanning algorithm for shortest-path computation that efficiently updates the path when the map changes (e.g., new obstacles detected). Unlike A*, which must replan from scratch, D* Lite reuses previous computation, making it suitable for real-time navigation in partially known environments where sensor updates continuously reveal new information.

NavigationPlanning

DAgger

Dataset Aggregation — an interactive imitation learning algorithm that addresses distributional shift in behavioral cloning. DAgger iteratively: (1) runs the current policy, (2) queries the expert for corrections at the visited states, and (3) aggregates the new labeled data with the existing dataset for re-training. This on-policy data collection ensures the policy learns to recover from its own mistakes.
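
The loop can be sketched on a toy problem where the "expert" is a sign function and the "policy" a lookup table (all of this is hypothetical scaffolding, not a real implementation):

```python
def expert(state):
    """Stand-in expert: the correct action is the sign of the state."""
    return 1 if state >= 0 else -1

dataset = {}                        # aggregated (state -> expert action) pairs

def policy(state):
    return dataset.get(state, 0)    # unvisited states fall back to a default action

for _ in range(3):                  # DAgger iterations
    visited = [-2, -1, 0, 1, 2]     # states the current policy reaches when rolled out
    for s in visited:
        dataset[s] = expert(s)      # step (2): expert relabels the policy's own states
```

The key point is step (2): labels come from states the policy itself visits, not only from the expert's demonstrations.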

Robot LearningImitation Learning

Data Augmentation Robotics

Artificially expanding a robot dataset by applying transformations that preserve label validity: image augmentations (crop, color jitter, blur), proprioceptive noise injection, action perturbations, and viewpoint synthesis. Data augmentation is critical for sample-efficient robot learning — random crop augmentation alone can double the effective dataset size and significantly improve policy generalization.

DataRobot Learning

Data Flywheel

A virtuous cycle where a deployed robot policy generates new interaction data, which is used to improve the policy, which generates better data, and so on. The concept is central to real-world robot learning systems — each deployment cycle adds to the training dataset. Self-improvement loops, active learning, and fleet learning are mechanisms that drive the data flywheel.

Robot LearningDeployment

Dead Reckoning

Estimating the robot's current position by integrating velocity or displacement measurements from a known starting point, without external references. Wheel encoders and IMUs provide dead-reckoning data. Accumulated errors (drift) grow without bound, so dead reckoning is typically fused with absolute measurements (GPS, visual landmarks, LiDAR scan matching) to maintain accuracy.
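
A differential-drive sketch: integrating commanded (v, ω) over small timesteps (values hypothetical):

```python
import math

def integrate(pose, v, omega, dt):
    """One dead-reckoning step for a planar robot; pose = (x, y, theta)."""
    x, y, theta = pose
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return (x, y, theta)

pose = (0.0, 0.0, 0.0)
for _ in range(100):                       # drive straight at 1 m/s for 1 s
    pose = integrate(pose, 1.0, 0.0, 0.01)
```

Any bias in v or ω integrates into unbounded drift, which is why dead reckoning is fused with absolute measurements.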

NavigationSensors

Deformable Object Manipulation

Manipulating objects that change shape under applied forces: cloth, rope, food items, bags, soft toys. Deformable objects violate the rigid-body assumption underlying most grasp planning and motion planning algorithms. Simulation-based approaches (with FEM or position-based dynamics) and learning-based methods are the two main research directions.

Manipulation

Degrees of Freedom

The number of independent parameters that define a system's configuration. A rigid body in 3D space has 6 DOF (3 translation + 3 rotation). A typical industrial robot arm has 6 DOF. A robot with more DOF than needed for a task is kinematically redundant, providing extra flexibility for obstacle avoidance and optimization.

KinematicsHardware

Delta Robot

A parallel-link robot with three arms connected to a common mobile platform, providing 3 DOF of translational motion at very high speed. Delta robots are used in food packaging, pharmaceutical sorting, and electronics assembly where pick-and-place cycle times under 1 second are required. Adding a fourth actuator enables rotation of the end-effector.

HardwareIndustrial

Demonstrations

Expert-provided examples of task execution used to train imitation learning policies. Demonstrations typically consist of observation-action pairs recorded during human teleoperation or kinesthetic teaching. Quality, diversity, and quantity of demonstrations directly impact learned policy performance. Common formats include HDF5 episode files, RLDS records, and LeRobot datasets.

Robot LearningImitation LearningData

Denavit-Hartenberg Parameters

A standardized parameterization for describing the geometry of serial kinematic chains using four parameters per joint: link length (a), link twist (α), link offset (d), and joint angle (θ). DH parameters enable systematic computation of forward kinematics as a product of homogeneous transformation matrices. They are the standard in industrial robot documentation and textbooks.
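
A minimal sketch: building the per-joint transform from (a, α, d, θ) and chaining two of them for a planar 2-link arm (link lengths hypothetical):

```python
import numpy as np

def dh_transform(a, alpha, d, theta):
    """Homogeneous transform for one joint from standard DH parameters."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ ct, -st * ca,  st * sa, a * ct],
                     [ st,  ct * ca, -ct * sa, a * st],
                     [0.0,       sa,       ca,      d],
                     [0.0,      0.0,      0.0,    1.0]])

# 2-link planar arm (both links 1 m), both joints at 90 degrees
T = dh_transform(1.0, 0.0, 0.0, np.pi / 2) @ dh_transform(1.0, 0.0, 0.0, np.pi / 2)
ee = T[:2, 3]   # end-effector (x, y) position
```

Forward kinematics is just the left-to-right product of these per-joint matrices.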

Kinematics

Depth Camera

A sensor that captures per-pixel depth (distance) information in addition to color. Technologies include structured light (e.g., the original Kinect), time-of-flight (Azure Kinect, Intel RealSense L515), and active stereo (Intel RealSense D400 series). Depth cameras produce point clouds or depth maps used for object detection, grasping, obstacle avoidance, and 3D scene understanding. Range, accuracy, and sunlight robustness vary by technology.

SensorsVisionHardware

Dexterous Hand

A multi-fingered robotic hand with enough degrees of freedom (typically 12–24) to perform in-hand manipulation: repositioning, reorienting, and manipulating objects within the hand. Research dexterous hands include the Shadow Hand, Allegro Hand, and LEAP Hand. Dexterous hands are essential for tasks requiring human-like manipulation dexterity.

HardwareDexterousGrasping

Diffusion Policy

A robot policy that generates actions using a denoising diffusion process — starting from random noise and iteratively refining it into a coherent action sequence. Diffusion policies excel at representing multi-modal action distributions and producing smooth, temporally consistent trajectories. They have achieved state-of-the-art results on contact-rich manipulation tasks. Inference is slower than for feedforward policies due to the iterative denoising steps.

Robot LearningPolicy

Digital Twin Robotics

A synchronized virtual replica of a physical robot and its environment that mirrors real-time state and behavior. Digital twins enable remote monitoring, predictive maintenance, virtual commissioning, and what-if simulation. They are powered by continuous data streams from the physical system and high-fidelity physics simulation of the virtual counterpart.

SimulationIndustrial

Direct Drive

A motor configuration where the rotor is coupled directly to the load without any gear reduction. Direct-drive joints have zero backlash, very low friction, and full backdrivability, making them ideal for force-sensitive tasks and kinesthetic teaching. The downside is lower torque density compared to geared systems — the motor must be larger and heavier to produce equivalent torque.

HardwareActuation

Docker in Robotics

Using Docker containers to package robot software stacks (ROS nodes, dependencies, configurations) for reproducible deployment. Docker eliminates 'works on my machine' problems, enables versioned deployment of robot software, and simplifies multi-robot fleet management. GPU support (NVIDIA Container Toolkit) enables containerized inference for perception and learning workloads.

SoftwareDeployment

Domain Randomization

A sim-to-real technique that trains policies in simulation with heavily randomized environment parameters (lighting, textures, physics properties, camera poses, object shapes) so the policy learns to be robust to the variations it will encounter in the real world. If the randomization distribution is wide enough to encompass reality, the policy transfers without any real-world fine-tuning.
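
A minimal sketch of the sampling step; the parameter names and ranges below are made up for illustration, and real ranges are tuned per task and simulator:

```python
import random

# Illustrative randomization ranges (hypothetical values).
RANDOMIZATION = {
    "light_intensity": (0.3, 1.5),   # relative scene brightness
    "object_mass_kg":  (0.05, 0.60),
    "friction_coeff":  (0.4, 1.2),
    "camera_yaw_deg":  (-5.0, 5.0),
}

def sample_sim_params(rng=random):
    """Draw one randomized parameter set before each simulated episode."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANDOMIZATION.items()}

params = sample_sim_params()   # apply to the simulator, then roll out an episode
```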

SimulationSim-to-Real

Drake

An open-source C++/Python toolbox for model-based design, simulation, and control of robots, developed at MIT and Toyota Research Institute. Drake provides multibody dynamics, optimization-based planning, and control tools. It is widely used in manipulation research for its accurate contact simulation and integration with mathematical programming solvers.

SimulationSoftware

Dual-Arm Robot

A robotic system with two independently controlled arms, enabling bimanual manipulation. Dual-arm robots can perform tasks that require coordinated two-handed operations: holding and screwing, folding, handover between arms. Examples: ABB YuMi, Rethink Robotics Baxter, ALOHA. Coordination control between the two arms is a key challenge.

HardwareManipulation

DWA

Dynamic Window Approach — a velocity-based local planner for mobile robots that samples candidate velocities (linear and angular) within the robot's dynamic constraints (acceleration limits), simulates forward trajectories for each, and selects the velocity that best balances progress toward the goal, obstacle clearance, and velocity preference. DWA is widely used in ROS navigation.
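
The sample-simulate-score loop can be sketched for a unicycle model. The velocity ranges, weights, and 0.3 m safety radius below are illustrative, and the objective is simplified to goal progress plus a speed bonus with hard collision rejection (real DWA implementations also weigh heading and clearance terms):

```python
import math

def simulate(x, y, th, v, w, dt=0.1, steps=10):
    """Forward-simulate a unicycle model at constant (v, w)."""
    traj = []
    for _ in range(steps):
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        th += w * dt
        traj.append((x, y))
    return traj

def dwa_step(state, goal, obstacles, v_range=(0.0, 0.5), w_range=(-1.0, 1.0)):
    """Sample (v, w) pairs, reject colliding rollouts, prefer progress and speed."""
    best, best_cmd = -math.inf, (0.0, 0.0)
    for i in range(6):
        for j in range(9):
            v = v_range[0] + i * (v_range[1] - v_range[0]) / 5
            w = w_range[0] + j * (w_range[1] - w_range[0]) / 8
            traj = simulate(*state, v, w)
            clearance = min((math.dist(p, ob) for p in traj for ob in obstacles),
                            default=math.inf)
            if clearance < 0.3:            # rollout passes too close: reject
                continue
            score = -math.dist(traj[-1], goal) + 0.1 * v
            if score > best:
                best, best_cmd = score, (v, w)
    return best_cmd

# Goal straight ahead, obstacle off to the side: drives forward at full speed.
cmd = dwa_step((0.0, 0.0, 0.0), goal=(2.0, 0.0), obstacles=[(1.0, 0.6)])
```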

NavigationPlanning

Dynamics Model

A mathematical model describing how forces and torques produce motion in a robot. The standard form is the manipulator equation: M(q)q̈ + C(q,q̇)q̇ + g(q) = τ, where M is the mass matrix, C is Coriolis/centrifugal, g is gravity, and τ is joint torque. Accurate dynamics models are essential for computed torque control, simulation, and model-based reinforcement learning.
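
For a one-link pendulum the manipulator equation reduces to scalars, which makes it easy to evaluate directly (the mass and length values are illustrative):

```python
import numpy as np

# One-link pendulum: point mass m at distance l from the joint axis.
m, l, g0 = 1.0, 0.5, 9.81

def inverse_dynamics(q, qd, qdd):
    """Manipulator equation tau = M(q)*qdd + C(q,qd)*qd + g(q), 1 DOF,
    with q measured from the downward vertical."""
    M = m * l**2                  # mass matrix (a scalar here)
    C = 0.0                       # no Coriolis/centrifugal terms with 1 DOF
    g = m * g0 * l * np.sin(q)    # gravity torque
    return M * qdd + C * qd + g

# Torque needed to hold the link horizontal (q = pi/2) against gravity:
tau_hold = inverse_dynamics(np.pi / 2, 0.0, 0.0)   # m * g0 * l = 4.905 N*m
```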

Dynamics

Data Loader

A software component that efficiently loads, preprocesses, and batches training data for neural network training. In robot learning, data loaders must handle heterogeneous data: images (decoding, augmentation), proprioception (normalization), actions, and variable-length episodes. PyTorch DataLoader with multiple workers and prefetching is the standard implementation.

MLSoftware

Diffusion Model

A generative model that learns to reverse a gradual noising process: training adds Gaussian noise to data over T steps, and the model learns to denoise at each step. Sampling generates data by starting from pure noise and iteratively denoising. Diffusion models produce high-quality, diverse samples and have been adapted for robot action generation as Diffusion Policies.
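
A sketch of the forward (noising) half of the process, assuming a linear beta schedule; the denoising network and its training loop are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
betas = np.linspace(1e-4, 0.02, T)       # linear noise schedule (illustrative)
alphas_bar = np.cumprod(1.0 - betas)     # cumulative signal retention

def q_sample(x0, t):
    """Forward process: x_t = sqrt(a_bar_t)*x0 + sqrt(1 - a_bar_t)*eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps, eps

x0 = np.array([0.3, -0.8])               # e.g., a 2-D action sample
x_noisy, eps = q_sample(x0, t=T - 1)
# Training regresses a network to predict eps from (x_noisy, t);
# sampling then runs the learned reverse process starting from pure noise.
```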

MLGenerative

DINOv2

A self-supervised vision transformer model trained by Meta on a curated dataset of 142M images using a self-distillation objective. DINOv2 learns powerful visual representations without any labels. Its features transfer well to robotics tasks — manipulation policies using frozen DINOv2 encoders achieve strong performance with minimal fine-tuning on robot data.

MLVisionRepresentation Learning

Discriminator

The component of a GAN or adversarial training setup that classifies inputs as real (from the data distribution) or fake (from the generator). In robotics, discriminators are used in GAIL (Generative Adversarial Imitation Learning) to distinguish expert demonstrations from policy rollouts, providing a learned reward signal for training the policy.

MLGenerative

Dropout

A regularization technique that randomly sets a fraction of neural network activations to zero during training, preventing co-adaptation of neurons. Dropout reduces overfitting and improves generalization. At inference time, all neurons are active but scaled. Dropout is commonly applied in policy networks to prevent overfitting to limited robot demonstration data.
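
A numpy sketch of the common "inverted dropout" variant, which rescales surviving activations at training time so that no inference-time scaling is needed:

```python
import numpy as np

def dropout(x, p, rng, training=True):
    """Inverted dropout: zero a fraction p of activations during training and
    scale survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones(10_000)
y = dropout(x, p=0.3, rng=rng)
# Roughly 30% of entries are zeroed, yet the mean stays near 1.0.
```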

MLTraining

DK1 Platform

A bimanual manipulation development kit by Silicon Valley Robotics Center, consisting of two robot arms with integrated teleoperation capabilities. DK1 is designed for collecting high-quality demonstration data for robot learning research. It supports multiple teleoperation modalities and records synchronized multi-camera, proprioceptive, and action data.

HardwareTeleoperationManipulation

Digit

A bipedal humanoid robot by Agility Robotics designed for logistics and warehouse operations. Digit has two arms and human-proportioned legs, enabling navigation in human-built environments. It can pick, carry, and place tote bins and navigate through doorways and ramps. Amazon has deployed Digit robots in its fulfillment centers for testing.

HardwareHumanoidLocomotion

DROID

Distributed Robot Interaction Dataset — a large-scale robot manipulation dataset collected across multiple institutions, featuring diverse robots, environments, and tasks. DROID provides a standardized collection protocol and data format, contributing to the development of generalist manipulation policies through co-training on diverse data sources.

DatasetRobot Learning

DDPG

Deep Deterministic Policy Gradient — an off-policy RL algorithm for continuous action spaces that uses a deterministic policy and a Q-function, trained via the deterministic policy gradient theorem. DDPG uses experience replay and target networks for stability. It has been widely applied to robot control tasks but can be sensitive to hyperparameters. TD3 (Twin Delayed DDPG) addresses its overestimation bias.

RL

DQN

Deep Q-Network — a value-based RL algorithm that uses a neural network to approximate the action-value function Q(s,a), selecting actions by maximizing Q. DQN introduced experience replay and target networks for stable training. While primarily used for discrete action spaces (Atari games), DQN variants have been adapted for discretized robot control.

RL

Delivery Robot

An autonomous mobile robot that transports packages, food, or supplies along sidewalks, corridors, or roads. Last-mile delivery robots (Starship, Nuro, Serve Robotics) use GPS, LiDAR, and cameras for navigation. Indoor delivery robots are deployed in hospitals, hotels, and offices. Key challenges include pedestrian interaction, traffic rules, and weather robustness.

ApplicationsMobile RoboticsNavigation

Drone

An unmanned aerial vehicle (UAV), typically a multirotor or fixed-wing aircraft capable of autonomous or remotely piloted flight. Drones are used in robotics for aerial inspection, mapping, delivery, agricultural monitoring, and search-and-rescue. Key technologies include GPS waypoint navigation, visual SLAM, obstacle avoidance, and automated landing.

ApplicationsMobile Robotics

Depth Completion

Predicting a dense, complete depth map from sparse depth measurements (e.g., from LiDAR) and optionally an RGB image. LiDAR provides accurate but sparse depth; depth completion fills in the missing values. Deep learning methods (CSPN, GuideNet) use image structure to infer geometrically plausible depth at image resolution. Used in autonomous vehicles and RGB-D robot perception.

VisionSensors

Depth Estimation

Predicting the distance from the camera to scene points. Methods include: stereo matching (geometric, requires calibrated stereo pair), structured light (requires active illumination), time-of-flight, and monocular depth estimation (learning-based, scale-ambiguous). Accurate depth is essential for 3D grasping, obstacle avoidance, and SLAM.

Vision

Dense Optical Flow

Computing per-pixel velocity vectors between consecutive frames, describing how each pixel moved. Classical methods (Horn-Schunck, TV-L1) minimize energy functionals; deep methods (RAFT, FlowFormer) achieve state of the art. Dense optical flow is used for motion segmentation, action recognition, and deformable object tracking in robotics.

Vision

Disparity Map

A pixel-wise image where each value is the horizontal displacement between corresponding pixels in a rectified stereo image pair. Disparity is inversely proportional to depth: d = fB/Z, where f is focal length, B is baseline, and Z is depth. Disparity maps are the intermediate representation in stereo depth estimation.
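
Inverting d = fB/Z gives metric depth from disparity; the focal length and baseline below are illustrative:

```python
def disparity_to_depth(d_px, focal_px, baseline_m):
    """Invert d = f*B/Z to recover depth Z from disparity d."""
    return focal_px * baseline_m / d_px

# Illustrative rig: f = 600 px, baseline B = 0.06 m.
# A 12-pixel disparity then corresponds to 3.0 m of depth.
Z = disparity_to_depth(d_px=12.0, focal_px=600.0, baseline_m=0.06)
```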

Vision

Duty Cycle

In legged locomotion, the fraction of each gait cycle that a leg spends in stance (ground contact) rather than swing. Slower gaits (walk) have high duty cycles (>0.5); faster dynamic gaits (trot, gallop) have lower duty cycles. Duty cycle determines the overlap between stance phases and the stability properties of the gait.

Locomotion

Dynamic Gait

A locomotion pattern where the robot is dynamically unstable during execution — the center of mass projection leaves the support polygon — relying on inertial effects and timed footsteps to maintain overall stability. Trotting, bounding, galloping, and jumping are dynamic gaits. They are faster and more energy-efficient than static gaits but harder to control.

Locomotion

Deformable Grasp

Grasping deformable objects (bags, clothing, cables, food) where the object shape changes during interaction. Deformable grasps require adaptive force control and visual feedback to handle variable object geometry. The gripper must apply sufficient force to hold the object without excessive deformation or damage.

ManipulationGrasping

Dynamic Grasp

Grasping a moving object, requiring the robot to predict the object's trajectory, plan a grasp pose on the predicted trajectory, and intercept it with precise timing. Dynamic grasping extends pick-and-place to moving targets (conveyor belt items, tossed objects) and requires tight integration of perception, prediction, and motion planning.

ManipulationGrasping

DDS

Data Distribution Service — the publish-subscribe middleware standard underlying ROS 2. DDS provides configurable Quality of Service (QoS) policies, automatic discovery of nodes, and tunable reliability and delivery guarantees. Different DDS implementations (FastDDS, Cyclone DDS, Connext) provide different performance-reliability trade-offs for ROS 2 robot systems.

Software

Differential Flatness

A system property (also simply called flatness) where all states and inputs can be expressed as functions of a flat output and its derivatives. Differentially flat systems allow exact trajectory generation without solving differential equations — trajectories in flat output space exactly satisfy the dynamics. Quadrotors and car-like robots are differentially flat.

MathControlPlanning

Dual Numbers

An extension of real numbers using a nilpotent element ε (ε² = 0). Dual numbers enable simultaneous computation of a function and its derivative in a single evaluation (automatic differentiation). In robotics, dual quaternions represent rigid body transforms and their derivatives compactly, enabling efficient Jacobian computation in kinematic chains.
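
A minimal dual-number class is enough to show forward-mode differentiation through addition and multiplication (sufficient for polynomials):

```python
class Dual:
    """Dual number a + b*eps with eps**2 == 0; the eps part carries the derivative."""
    def __init__(self, real, dual=0.0):
        self.real, self.dual = real, dual
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.dual + other.dual)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps^2 = 0
        return Dual(self.real * other.real,
                    self.real * other.dual + self.dual * other.real)
    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x          # f'(x) = 6x + 2

y = f(Dual(4.0, 1.0))                 # seed dx/dx = 1
# y.real == f(4) == 56.0 and y.dual == f'(4) == 26.0 in one evaluation.
```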

MathKinematics

Digital Manufacturing

An integrated approach to manufacturing that uses digital technologies (simulation, IoT, AI, AR) to design, plan, produce, and optimize products. Digital manufacturing leverages digital twins, generative design, and robot simulation to compress development cycles and improve manufacturing quality. It is the operational pillar of Industry 4.0.

ApplicationsIndustrial

Drone Swarm

A coordinated fleet of multiple drones operating collectively under distributed control, performing tasks beyond the capability of a single drone: wide-area coverage, redundant sensing, formation-based payload transport. Swarm coordination uses decentralized algorithms based on local sensing and communication, analogous to bird flocking.

ApplicationsMobile Robotics

Dataset Curation

The process of selecting, filtering, cleaning, and organizing demonstration data for robot learning. Curation removes low-quality episodes (failed attempts, operator errors), balances task and object diversity, augments underrepresented cases, and structures data for efficient training. High-quality curated datasets dramatically outperform equivalent-size uncurated ones.

Robot LearningData

E

Embodied AI

Embodied AI refers to artificial intelligence systems that perceive and act through a physical body situated in the real world, rather than operating purely on text or images in isolation. The embodiment hypothesis holds that true intelligence requires sensorimotor grounding — learning through interaction, not just pattern matching on static datasets. In practice, embodied AI research encompasses robot learning, VLA models, sim-to-real transfer, and physical foundation models. Companies like Google DeepMind (RT series), Physical Intelligence (pi0), and NVIDIA (GR00T) are the primary industrial drivers. SVRC's own data platform is built for embodied AI data workflows.

Foundation ModelPhysical AI

End-Effector

The end-effector is the device at the distal end of a robot arm that directly interacts with the environment. It may be a parallel-jaw gripper, a suction cup, a multi-finger hand, a welding torch, a paint nozzle, or any task-specific tool. The end-effector's pose — its position and orientation in space — is the primary control output for most manipulation policies. The tool center point (TCP) is the reference point on the end-effector used for Cartesian control. Choosing the right end-effector is a critical deployment decision: grippers optimized for one object class (e.g., rigid boxes) may fail on soft or irregular items. Browse SVRC hardware options.

HardwareManipulation

Episode

An episode is a single, complete attempt at a task — from the initial state to either task success, failure, or a timeout. In reinforcement learning, the agent interacts with the environment for one episode, accumulates rewards, and then the environment is reset. In imitation learning, each recorded demonstration constitutes one episode. Episodes are the fundamental unit of robot learning datasets: a dataset of 1,000 episodes contains 1,000 task attempts with associated observations, actions, and outcomes. Episode length, reset conditions, and success criteria must be precisely defined to ensure consistent data collection.

DataReinforcement LearningImitation Learning

Extrinsics (camera)

Camera extrinsics define the position and orientation (6-DOF pose) of a camera relative to a reference frame — typically the robot base or end-effector. Together with intrinsic parameters (focal length, principal point, lens distortion), extrinsics allow projecting 3D world points onto the image plane and, conversely, lifting 2D detections into 3D space. Accurate extrinsic calibration is critical for visuomotor policies that must map visual observations to robot actions in a consistent coordinate frame. Eye-in-hand (wrist-mounted) cameras require re-calibration when the end-effector or camera is replaced.

PerceptionCalibration

Edge Computing Robotics

Processing data locally on the robot or on nearby edge servers rather than in the cloud, reducing latency and bandwidth requirements. Edge computing is critical for real-time robot perception and control where cloud round-trip times (>50ms) are unacceptable. NVIDIA Jetson, Intel NUC, and Qualcomm RB5 are common edge computing platforms for robots.

SoftwareHardware

Emergency Stop

A safety mechanism that immediately halts all robot motion when activated, typically via a physical button. E-stop systems are mandatory for industrial robots (ISO 10218) and must be hardwired (not software-only) to ensure reliability. Performance levels (e.g., PLe per ISO 13849) specify the required safety integrity of the e-stop circuit.

SafetyHardware

Encoder

A sensor that measures the angular position or velocity of a rotating shaft. Incremental encoders output pulse trains proportional to rotation; absolute encoders report the exact angle at power-on. Encoders are fundamental to closed-loop motor control and joint position sensing. Resolution is specified in counts per revolution (CPR) or bits; high-end robot arms use 17–20 bit absolute encoders.

SensorsHardware

End-to-End Learning

Training a single neural network to map directly from raw sensor inputs (images, proprioception) to robot actions, without hand-designed intermediate representations or modular pipelines. End-to-end learning avoids the information bottleneck of modular systems but requires more training data and is harder to interpret. VLA models represent the frontier of end-to-end robot learning.

Robot Learning

Energy-Based Model

A generative model that assigns an energy (scalar) to each input-output pair; low energy indicates high compatibility. In robot learning, EBMs can represent implicit policies where the action is found by minimizing the energy function at test time. EBM policies handle multi-modality naturally — multiple action modes correspond to multiple energy minima. Implicit Behavioral Cloning (IBC) is a notable example.

Robot LearningPolicy

Episode Format

The data structure used to store a single demonstration or rollout: a sequence of (observation, action, reward, done) tuples from start to termination. Common episode formats include HDF5 files (one file per episode), RLDS records (TFRecord sequences), and LeRobot datasets (Parquet + video). Standardized episode formats enable interoperability between data collection and training pipelines.

Data

Exploration

The process of actively moving through unknown space to build or extend a map. Exploration strategies include frontier-based exploration (navigate to the boundary between known and unknown space), information-gain approaches (maximize expected map information), and curiosity-driven methods. Autonomous exploration is essential for mapping unknown buildings, caves, or disaster sites.

NavigationSLAM

Embedding

A dense, learned vector representation of a discrete entity (word, token, object class) in a continuous vector space. Embeddings capture semantic relationships — similar entities have similar embeddings. In robotics, embeddings represent: language instructions (sentence embeddings), visual concepts (CLIP embeddings), task identifiers, and robot actions (in discretized action spaces).

MLRepresentation Learning

Encoder-Decoder

A neural network architecture with an encoder that compresses input into a latent representation and a decoder that generates output from this representation. The encoder-decoder pattern appears throughout robotics: image segmentation (U-Net), sequence-to-sequence models, VAEs for state representation, and policy architectures that encode observations and decode actions.

MLArchitecture

Epoch

One complete pass through the entire training dataset. Training typically runs for many epochs (10–1000), with the model seeing each data point multiple times. In robot learning, the number of epochs depends on dataset size — large pre-training datasets may need only a few epochs, while small demonstration datasets require many epochs with heavy augmentation.

MLTraining

Eldercare Robot

A robot designed to assist elderly individuals with daily living activities, health monitoring, social interaction, and mobility support. Eldercare robots range from social companions (providing reminders, video calls) to physical assistants (helping with transfers, fetching objects). They must be safe, intuitive, and culturally sensitive.

ApplicationsHRISafety

Energy Shaping

A passivity-based control technique that modifies the robot's closed-loop potential and kinetic energy to place a stable equilibrium at the desired configuration, while adding damping to dissipate energy. Energy shaping controllers are inherently safe (they can't inject unbounded energy) and have been applied to underactuated systems like pendulums and walking robots.

Control

Event-Triggered Control

A control paradigm where the controller updates only when a triggering condition is violated (e.g., the state error exceeds a threshold), rather than at fixed time intervals. Event-triggered control reduces communication and computation load in networked robot systems. It is particularly relevant for multi-robot coordination over bandwidth-limited networks.

ControlSoftware

Edge Detection

Finding boundaries between image regions by detecting sharp intensity changes. Canny edge detection (Gaussian smoothing + gradient magnitude + non-maximum suppression + hysteresis thresholding) is the classic algorithm. Edges provide structural cues for object recognition, grasp planning, and visual SLAM feature extraction.

Vision

Epipolar Geometry

The geometric relationship between two camera views of the same 3D scene, described by the fundamental matrix F or essential matrix E. For any point in one image, its correspondence in the other image lies on a line (the epipolar line). Epipolar geometry enables stereo matching (search 1D instead of 2D) and relative camera pose estimation.

VisionCalibration

Encoder Resolution

The number of discrete position steps an encoder can detect per revolution, expressed in counts per revolution (CPR) or bits. Higher resolution enables more precise position control. Typical values: 12-bit (4096 CPR) for budget servos, 17-20 bit (131,072–1,048,576 CPR) for high-precision robot arms. Resolution limits the minimum controllable position increment.
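
The bit count maps to angular resolution as 360 degrees divided by 2^bits:

```python
def encoder_resolution_deg(bits):
    """Smallest detectable angle for an absolute encoder with the given bit count."""
    cpr = 2 ** bits                    # counts per revolution
    return 360.0 / cpr

res_12 = encoder_resolution_deg(12)    # 12-bit: ~0.088 deg per count
res_17 = encoder_resolution_deg(17)    # 17-bit: ~0.0027 deg per count
```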

HardwareSensors

EtherCAT

Ethernet for Control Automation Technology — a real-time Ethernet fieldbus protocol that processes frames on the fly as they pass through slave devices, achieving cycle times of 100μs–4ms with nanosecond-level synchronization. EtherCAT is a dominant communication protocol in collaborative robots (Franka, Universal Robots) for real-time servo-loop communication.

SoftwareHardwareElectronics

Euler Angles

A set of three rotational angles (α, β, γ) representing a 3D orientation as a sequence of rotations about coordinate axes. Common conventions: ZYX (yaw-pitch-roll), ZYZ (for wrist kinematics). Euler angles are intuitive but suffer from gimbal lock (singularity) and non-unique representations. Quaternions are preferred for numerical computation.
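
The ZYX convention can be sketched as a product of elementary rotations:

```python
import numpy as np

def rot_zyx(yaw, pitch, roll):
    """ZYX (yaw-pitch-roll) Euler angles to a rotation matrix: R = Rz @ Ry @ Rx."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    return Rz @ Ry @ Rx

R = rot_zyx(np.pi / 2, 0.0, 0.0)   # pure 90-degree yaw
# R maps the x-axis onto the y-axis; any valid R is orthonormal with det(R) = 1.
```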

MathKinematics

Euler-Lagrange Equation

The fundamental equation of Lagrangian mechanics: d/dt(∂L/∂q̇) - ∂L/∂q = Q, where L = T - V is the Lagrangian and Q is the generalized force. Applying the Euler-Lagrange equation to a robot's kinetic and potential energy yields the manipulator equation of motion M(q)q̈ + C(q,q̇)q̇ + g(q) = τ.

MathDynamics

Electronic Manufacturing Robot

A robot specialized for electronics assembly: PCB component placement (SMT pick-and-place at >10,000 CPH), soldering, inspection, wire bonding, and encapsulation. Electronics robots achieve micron-level precision and operate at high speed in cleanroom environments. They are the most precise and highest-throughput class of manufacturing robot.

ApplicationsIndustrial

Exoskeleton

A wearable robotic structure that augments the wearer's physical capabilities or provides rehabilitation. Industrial exoskeletons reduce fatigue and musculoskeletal injury for warehouse workers. Medical exoskeletons (HAL, ReWalk, Ekso) assist walking for people with paraplegia. Supernumerary exoskeletons add extra limbs rather than augmenting existing ones.

ApplicationsHRIHardware

Embodiment Gap

The performance degradation when transferring policies between robots with different morphologies, kinematics, or action spaces. Policies trained on ALOHA may not transfer to Franka because joint configurations, DOF, and sensor suites differ. Bridging the embodiment gap requires: embodiment-agnostic representations, action space normalization, and cross-embodiment co-training.

Robot LearningTransfer Learning

Equivariant Neural Network

A neural network architecture that is equivariant to specific symmetry transformations (rotations, translations, reflections) — when the input is transformed, the output transforms predictably. SE(3)-equivariant networks for 3D point cloud manipulation and rotation-equivariant policies are used in robotics to improve sample efficiency by hard-coding symmetry properties.

Robot LearningMLMath

Exploration-Exploitation Tradeoff

The fundamental dilemma in RL: the agent must balance exploiting known high-reward actions vs. exploring unknown actions that might yield higher rewards. Pure exploitation leads to local optima; pure exploration is inefficient. Methods like ε-greedy, UCB, Thompson sampling, and curiosity-driven exploration manage this tradeoff. In robotics, safe exploration constraints make the tradeoff harder.

Robot LearningRL

F

Force Torque Sensor (FT Sensor)

A force-torque sensor measures the six-axis wrench (three forces Fx, Fy, Fz and three torques Tx, Ty, Tz) applied at a robot's wrist or end-effector. FT sensors are essential for contact-rich and assembly tasks where pure position control would either miss contacts or apply excessive force. They enable impedance and admittance control loops, detect slip and collision, and provide rich sensory inputs for learned policies. High-precision FT sensors from ATI and Robotiq are standard in research labs; MEMS-based low-cost sensors are increasingly viable for production deployments.

HardwareSensingControl

Foundation Model (robotics)

A foundation model is a large neural network pretrained on broad, diverse data that can be adapted to many downstream tasks via fine-tuning or prompting. In robotics, foundation models are typically large vision-language models (VLMs) extended with action outputs to form VLAs, or large visuomotor policies trained on cross-embodiment datasets. Examples include RT-2 (Google DeepMind), OpenVLA, Octo, and pi0 (Physical Intelligence). Foundation models for robotics are appealing because they can leverage internet-scale pretraining, support language conditioning, and generalize across tasks without per-task retraining from scratch. See SVRC model catalog.

VLAPretrainingGeneralization

Forward Kinematics (FK)

Forward kinematics computes the end-effector's pose in Cartesian space given the robot's joint angles (or displacements for prismatic joints). For a serial chain robot, FK is computed by multiplying a sequence of homogeneous transformation matrices (one per joint), typically derived from Denavit-Hartenberg (DH) parameters or a URDF description. FK always has a unique solution — given joint angles, there is exactly one end-effector pose — unlike the inverse problem (IK), which may have zero, one, or many solutions. FK is used in simulation, collision checking, visualization, and real-time robot state monitoring.

KinematicsControl

Failure Recovery

The ability of a robot system to detect failures (dropped objects, missed grasps, collisions, sensor faults) and execute recovery strategies to continue the task. Failure recovery is essential for autonomous operation — a robot that cannot recover from errors requires constant human supervision. Approaches include re-planning, retry policies, and anomaly-triggered fallback behaviors.

DeploymentPlanning

Feedback Linearization

A nonlinear control technique that algebraically transforms a nonlinear system into an equivalent linear system by canceling the nonlinearities through the control input. In robotics, it is essentially the same as computed torque control — the inverse dynamics cancel the nonlinear terms, and a linear PD or PID controller handles the remaining linear dynamics.

Control

Few-Shot Imitation

Learning a new manipulation task from a very small number of demonstrations (typically 1–10), enabled by strong pre-trained representations or meta-learned task embeddings. Few-shot imitation is the practical goal for deploying robots in varied real-world settings where collecting hundreds of demos per task is infeasible. Methods include task-conditioned policies, prompt-based VLAs, and retrieval-augmented approaches.

Robot LearningImitation Learning

Finger Gaiting

A dexterous manipulation strategy where fingers alternately release and re-contact an object to achieve large reorientations or translations that exceed any single finger's range of motion. Finger gaiting is analogous to legged locomotion — maintaining a stable grasp while cycling contacts. It requires planning contact sequences and is an active research topic in dexterous manipulation.

DexterousManipulation

Finite State Machine

A computational model with a finite number of states, transitions between states triggered by events or conditions, and actions associated with states or transitions. FSMs are the traditional way to structure robot task logic: IDLE → APPROACHING → GRASPING → LIFTING → PLACING. They are simple but become unwieldy for complex behaviors with many states and transitions.

SoftwarePlanning

Fleet Learning

Aggregating data and experience from multiple deployed robots to collectively improve a shared policy or model. When one robot encounters a new scenario, the learned adaptation can benefit all robots in the fleet. Fleet learning accelerates the data flywheel proportionally to fleet size. Challenges include data heterogeneity, communication bandwidth, and privacy in multi-customer deployments.

Robot LearningDeployment

Flow Matching

A generative modeling framework that trains a neural network to predict a velocity field that transports samples from a simple prior distribution (e.g., Gaussian) to the data distribution along straight paths. Flow matching offers faster training and inference than diffusion models while achieving comparable or better sample quality. It is being explored as a faster alternative to diffusion policies for robot action generation.
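
One training tuple for the straight-path (conditional) objective can be constructed as below; the 2-D "action" is illustrative and the regression network itself is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_pair(x1, rng):
    """One straight-path training tuple: x_t = (1-t)*x0 + t*x1,
    with target velocity v = x1 - x0 (constant along the path)."""
    x0 = rng.standard_normal(x1.shape)   # sample from the Gaussian prior
    t = rng.random()                     # random time in [0, 1]
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, t, v_target

x1 = np.array([0.3, -0.8])               # e.g., a demonstrated 2-D action
x_t, t, v = flow_matching_pair(x1, rng)
# A network v_theta(x_t, t) is regressed onto v; sampling then integrates
# dx/dt = v_theta from t = 0 (noise) to t = 1 (data).
```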

Robot LearningPolicy

Force Closure

A grasp configuration that can resist arbitrary external wrenches (forces and torques) through appropriate adjustment of finger contact forces. A force-closure grasp can hold an object stably regardless of the direction of disturbance. It is the mathematical criterion for a 'good' multi-finger grasp. Checking force closure involves analyzing the contact wrench space.

ManipulationGrasping

Force-Torque Sensor

A multi-axis sensor that measures forces and torques applied to a robot's wrist, tool, or finger. Typical configurations measure six axes (Fx, Fy, Fz, Tx, Ty, Tz). FT sensors enable force-controlled tasks like polishing, insertion, and safe human contact. Strain-gauge and capacitive designs dominate; MEMS variants are emerging for fingertip-scale integration.

SensorsHardwareControl

Forward Kinematics

The computation of end-effector pose (position + orientation) from joint angles using the kinematic chain model. For serial robots, FK is a straightforward chain of homogeneous transformations — one per joint. FK always has a unique, closed-form solution, unlike inverse kinematics. It is used in visualization, collision checking, and converting joint-space trajectories to Cartesian space.

Kinematics
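
A minimal FK sketch for a hypothetical 2-link planar arm (link lengths and joint angles are illustrative), chaining one homogeneous transform per joint:

```python
import numpy as np

# Forward kinematics for a 2-link planar arm: each joint contributes one
# 3x3 planar homogeneous transform (rotate by theta, translate along the
# rotated link axis), and chaining them yields the end-effector pose.
def planar_transform(theta, link_length):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, link_length * c],
                     [s,  c, link_length * s],
                     [0,  0, 1.0]])

def forward_kinematics(thetas, lengths):
    T = np.eye(3)
    for theta, L in zip(thetas, lengths):
        T = T @ planar_transform(theta, L)   # chain frame-to-frame transforms
    return T[0, 2], T[1, 2]                  # end-effector (x, y)

x, y = forward_kinematics([np.pi / 2, -np.pi / 2], [1.0, 1.0])
# First link points straight up, second bends back to horizontal: (1, 1).
```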

Foundation Model

A large-scale neural network pre-trained on broad data (internet text, images, videos) that can be adapted to downstream tasks with minimal fine-tuning. In robotics, foundation models provide visual representations (DINOv2), language grounding (CLIP, LLMs), and world models (video prediction), and can directly output robot actions (VLAs like RT-2, OpenVLA, π0). They promise to amortize the data cost of learning across tasks and embodiments.

Robot LearningVision-Language

Functional Safety

The aspect of safety that depends on the correct functioning of safety-related control systems. In robotics, functional safety is governed by ISO 13849 (safety-related parts of machinery control systems) and IEC 62061 (functional safety of machinery, derived from IEC 61508), with robot-specific requirements in ISO 10218. Safety functions include emergency stop, speed and force limiting, safety-rated monitored stop, and protective stop.

SafetyStandards

Feature Extraction

The process of computing informative, discriminative representations from raw sensor data. In robotics, visual feature extraction uses CNN or ViT encoders to transform images into compact feature vectors. Feature extraction can be end-to-end (learned jointly with the policy) or use frozen pre-trained encoders (DINOv2, CLIP). The quality of features directly impacts policy performance.

MLVision

Fine-Tuning

Adapting a pre-trained model to a specific downstream task by continuing training on task-specific data, typically with a lower learning rate. In robot learning, fine-tuning a pre-trained visual encoder or VLA on robot demonstration data is the standard transfer learning approach. Full fine-tuning updates all parameters; parameter-efficient methods (LoRA, adapters) update only a subset.

MLTraining

FPN

Feature Pyramid Network — a multi-scale feature extraction architecture that builds a pyramid of feature maps at different resolutions, enabling detection of objects at multiple scales. FPN is a standard component of modern object detectors (Faster R-CNN, RetinaNet) and is used in robotic perception for detecting objects of varying sizes in cluttered scenes.

MLVision

FANUC

A Japanese manufacturer of industrial robots, CNC systems, and factory automation equipment. FANUC is the world's largest maker of industrial robots, with over 1 million units installed globally. FANUC robots (M-series, LR Mate, CR-series cobots) are known for reliability, speed, and precision. They use the FANUC R-30iB controller and KAREL/TP programming.

HardwareIndustrial

Figure 01

A general-purpose humanoid robot by Figure AI, designed for warehouse and manufacturing tasks. Figure 01 stands about 5'6" (1.68 m) and weighs roughly 60 kg, with electric actuators and multi-fingered hands. The robot integrates an OpenAI-powered language model for natural language task instruction. Figure has partnerships with BMW for automotive manufacturing deployment.

HardwareHumanoid

Franka Emika Panda

A 7-DOF torque-controlled collaborative robot arm widely used in manipulation research. The Panda features torque sensors in all seven joints, enabling sensitive force control and kinesthetic teaching. Its open architecture (libfranka API, FCI) makes it the most popular research cobot. Key specs: 3kg payload, 855mm reach, ±0.1mm repeatability.

HardwareManipulation

Food Handling Robot

A robot designed for food preparation, packaging, or serving in compliance with food safety regulations. Food robots must handle variable, deformable items (produce, meat, baked goods) with appropriate hygiene. Applications include sushi preparation, pizza assembly, salad packaging, and restaurant cooking. Materials must be washdown-compatible and food-grade.

ApplicationsManipulationIndustrial

Feedback Linearization

A nonlinear control technique that uses state feedback to algebraically cancel nonlinearities, transforming the closed-loop system into a linear one. For fully actuated robots, feedback linearization via inverse dynamics exactly cancels gravity, Coriolis, and inertial terms. The remaining linear system is then controlled with standard linear methods (PD, LQR).

Control
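
A computed-torque sketch for a single pendulum link with dynamics I·q̈ + m·g·l·cos(q) = τ; masses, gains, and the simple Euler integration are illustrative, not a production controller:

```python
import numpy as np

# Feedback linearization (computed torque) for one pendulum link:
# cancel gravity exactly, then impose linear PD error dynamics.
I, m, g, l = 0.5, 1.0, 9.81, 0.3   # illustrative inertia, mass, link params
Kp, Kd = 100.0, 20.0               # critically damped: Kd^2 = 4*Kp
q, qd = 0.0, 0.0
q_des = 1.0                        # regulate to a fixed joint angle
dt = 0.001

for _ in range(5000):
    e, ed = q_des - q, -qd
    tau = I * (Kp * e + Kd * ed) + m * g * l * np.cos(q)  # control law
    qdd = (tau - m * g * l * np.cos(q)) / I               # plant dynamics
    qd += qdd * dt                                        # semi-implicit Euler
    q += qd * dt
# The closed loop is exactly linear, so q converges to q_des.
```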

Flatness

A property of a nonlinear system where all states and inputs can be expressed as algebraic functions of a flat output and its derivatives. Flat systems can be trajectory-planned entirely in the flat output space, then mapped back to states and inputs. Quadrotors (position and yaw as flat outputs) and wheeled robots are classic flat systems.

ControlPlanning

Funnel Control

A performance-based control strategy that guarantees the tracking error evolves within a prescribed time-varying funnel (performance envelope), without requiring explicit knowledge of the system model. The funnel boundaries tighten over time, ensuring convergence. Funnel control provides transient and steady-state performance guarantees with minimal model knowledge.

Control

Feature Matching

Finding corresponding features between two images by comparing their descriptors. SIFT, SURF, ORB, and BRISK are classical hand-crafted feature descriptors, matched by nearest-neighbor search in descriptor space. SuperGlue and LightGlue are learned matchers. Feature matching is used in visual SLAM (loop closure, relocalization) and object recognition. The ratio test and RANSAC filter outlier matches.

VisionSLAM
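
A sketch of Lowe's ratio test on synthetic 2-D descriptors (real descriptors are 32–128-D; the arrays here are illustrative):

```python
import numpy as np

# Lowe's ratio test: keep a match only if the nearest neighbor in
# descriptor space is clearly better than the second nearest.
def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)   # distance to all candidates
        j1, j2 = np.argsort(dists)[:2]               # two nearest neighbors
        if dists[j1] < ratio * dists[j2]:            # unambiguous match only
            matches.append((i, int(j1)))
    return matches

a = np.array([[0.0, 0.0], [5.0, 5.0]])
b = np.array([[0.1, 0.0], [5.0, 5.1], [0.12, 0.0]])  # b[2] nearly duplicates b[0]
matches = ratio_test_matches(a, b)
# a[0] matches b[0] and b[2] almost equally well, so it is rejected as
# ambiguous; a[1] matches b[1] unambiguously.
```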

Fisheye Camera

A camera with an extremely wide-angle lens (field of view approaching or exceeding 180°) that produces a heavily distorted image. Fisheye cameras are used in robots for wide-area coverage from a single camera — AMR navigation, inspection, and human-robot interaction monitoring. Fisheye lenses follow non-pinhole projection models (e.g., equidistant) and are calibrated with specialized tools (Kalibr, OpenCV's fisheye module).

VisionHardwareSensors

FLIR Camera

A thermal infrared camera by FLIR Systems (now Teledyne FLIR), widely used in robotics for detection in complete darkness, smoke, or fog. FLIR cameras detect heat signatures, making them invaluable for search-and-rescue, electrical inspection, and industrial monitoring. Common models include Lepton (miniature), Boson (drone-grade), and A-series (lab).

VisionSensorsHardware

Fall Recovery

The capability of a robot to autonomously recover from a fallen position (lying on the ground) to a standing configuration. Fall recovery is critical for bipedal and quadrupedal robots operating in real environments where falls are inevitable. Recovery strategies include lying-to-sitting, sitting-to-kneeling, and kneeling-to-standing posture sequences.

LocomotionSafety

Flight Phase

A period during dynamic locomotion where all feet are simultaneously off the ground. Flight phases occur in running, jumping, and bounding gaits. Flight phase dynamics are governed purely by the robot's initial conditions at liftoff (ballistic trajectory). Managing flight phase duration and landing configuration is critical for dynamic legged locomotion.

Locomotion

Foot Clearance

The minimum height of the swinging foot above the ground during a locomotion step. Insufficient foot clearance causes stumbling on terrain irregularities. Terrain estimation (from depth sensors) informs adaptive foot clearance — higher obstacles require more clearance. Foot clearance is a primary safety constraint in leg trajectory generation.

Locomotion

Footstep Planning

Computing a sequence of foot placements that guide a legged robot from its current configuration to a goal, while satisfying kinematic, dynamic, and terrain constraints. Footstep planners use terrain maps (from LiDAR or stereo) to identify safe footholds, avoiding gaps, steep slopes, and unstable surfaces.

LocomotionPlanningNavigation

Frontal Plane Control

Control of a robot's motion in the lateral (left-right) plane, orthogonal to the sagittal (forward-backward) plane. For bipedal robots, frontal plane balance (left-right balance) is typically the harder control problem. Ankle strategy, hip strategy, and stepping strategy are hierarchically employed as perturbation magnitude increases.

LocomotionControl

Form Closure

A geometric grasp property where the object's motion is completely constrained by the contact geometry alone, without requiring friction forces. A form-closed grasp maintains the object in a fixed configuration even with frictionless contacts. It requires contacts from multiple directions. Force closure is a weaker requirement (frictional contacts can resist wrenches).

ManipulationGraspingMath

Friction Cone

The set of contact forces that can be applied at a contact point without causing slipping, shaped like a cone around the contact normal. The half-angle of the cone is arctan(μ), where μ is the coefficient of friction. Grasp planning algorithms verify that required contact forces lie within the friction cones for all grasping contacts.

ManipulationGraspingMath
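
A minimal sketch of the friction-cone check at a single contact (the helper name and test forces are illustrative):

```python
import numpy as np

# Friction-cone check: the tangential component of the contact force must
# not exceed mu times the normal component — equivalently, the force must
# lie within a cone of half-angle arctan(mu) about the contact normal.
def in_friction_cone(force, normal, mu):
    normal = normal / np.linalg.norm(normal)
    f_n = float(np.dot(force, normal))          # normal component
    if f_n <= 0:                                # pulling away: contact cannot pull
        return False
    f_t = np.linalg.norm(force - f_n * normal)  # tangential magnitude
    return bool(f_t <= mu * f_n)

n = np.array([0.0, 0.0, 1.0])
in_friction_cone(np.array([0.2, 0.0, 1.0]), n, 0.5)   # holds: 0.2 <= 0.5 * 1.0
in_friction_cone(np.array([0.8, 0.0, 1.0]), n, 0.5)   # slips: 0.8 > 0.5 * 1.0
```

Grasp planners run exactly this test (in wrench space, over all contacts) when verifying a candidate grasp.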

Fieldbus

A digital communication network for real-time data exchange between robot controllers, servo drives, I/O modules, and sensors. EtherCAT (dominant in multi-axis motion control), PROFINET, DeviceNet, and CANopen are major fieldbuses. Fieldbus cycle times of 0.25–4ms enable tight synchronization of multi-axis servo loops.

HardwareSoftware

Flexible Joint

A robot joint with significant compliance between the motor shaft and the link, typically due to a flexible coupling or elastomer. Flexible joints arise from harmonic drives and belt drives. They cause vibration modes that complicate control. Joint flexibility models add extra DOF in the dynamics and require notch filters or state observers for stable high-gain control.

HardwareControlDynamics

FPGA in Robotics

Field-Programmable Gate Array — a reconfigurable digital hardware device used in robotics for real-time, low-latency computation: motor commutation, sensor signal processing, high-speed encoder counting, and custom control law implementation. FPGAs achieve deterministic <1μs latency, enabling ultra-high-bandwidth control loops impossible with general-purpose processors.

HardwareSoftwareElectronics

Factor Graph

A probabilistic graphical model where variable nodes (states to estimate) and factor nodes (constraints from measurements) are connected by edges. Inference on factor graphs (finding the maximum a posteriori estimate) solves SLAM and sensor fusion problems. GTSAM and g2o implement efficient sparse factor graph optimization for real-time SLAM.

MathSLAMSensors

FMCG Robot

Fast-Moving Consumer Goods robot — automated systems in FMCG supply chains for case packing, palletizing, labeling, and shelf replenishment. FMCG applications are characterized by high throughput (hundreds of items/minute), frequent SKU changeovers, and need for gentle handling of consumer packaging.

ApplicationsIndustrial

Fourth Industrial Revolution

Industry 4.0 — the current era of manufacturing characterized by cyber-physical systems, IoT, cloud computing, AI, and advanced robotics working in integrated ecosystems. Industry 4.0 technologies enable mass customization, predictive maintenance, real-time supply chain visibility, and autonomous production. Robotics is a central enabling technology.

ApplicationsIndustrial

G

Generalization (robot policy)

Generalization measures how well a robot policy performs on objects, scenes, or tasks it has not seen during training. It is the central challenge of robot learning: a policy that memorizes training demonstrations but fails on novel instances has no practical value. Researchers distinguish object generalization (new instances of known categories), category generalization (entirely new object classes), and task generalization (new instruction phrasings or goal configurations). Improving generalization typically requires larger and more diverse training data, co-training with internet data, domain randomization in simulation, and foundation model priors.

PolicyResearch Frontier

Grasp Pose

A grasp pose specifies the 6-DOF position and orientation of a robot hand or gripper relative to an object such that the gripper can close and securely hold the object. Grasp pose estimation is typically done from depth or point-cloud data using analytic methods (e.g., antipodal grasp sampling) or learned detectors such as GraspNet-1Billion, GQ-CNN, or AnyGrasp. A valid grasp pose must be reachable by the robot, collision-free during approach, and stable under the expected task loads. Grasp quality metrics include force-closure, contact stability, and task-specific wrench resistance.

ManipulationPerception

Gripper

A gripper is the most common class of robot end-effector, designed to grasp and hold objects. Parallel-jaw grippers are the simplest and most widely used, with two opposing fingers driven by a motor or pneumatics. Suction grippers use vacuum to pick smooth, flat surfaces. Soft grippers use compliant materials (silicone, fabric) to conform around irregular objects. Multi-fingered hands (3–5 fingers) enable dexterous manipulation but are harder to control and more expensive. Gripper selection depends critically on object geometry, surface properties, required payload, and whether in-hand reorientation is needed.

HardwareEnd-Effector

Gain Scheduling

A control strategy that adjusts controller gains based on the current operating point. In robotics, gain scheduling is used when a robot's dynamics change significantly across its workspace — for example, the inertia of a 6-DOF arm varies dramatically depending on joint configuration. Pre-computed gain tables or interpolation functions switch PID gains as joint angles change.

Control

Gaussian Process

A non-parametric probabilistic model that defines a distribution over functions, fully specified by a mean function and a kernel (covariance function). GPs are used in robotics for Bayesian optimization of controller parameters, model-based RL with uncertainty estimation, and safe exploration. They provide well-calibrated uncertainty estimates but scale poorly to large datasets (O(n³) training).

Robot LearningRLMath
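
A minimal GP-regression sketch with an RBF kernel, showing the calibrated uncertainty the entry describes; length scale, noise jitter, and the toy 1-D data are illustrative:

```python
import numpy as np

# GP regression: posterior mean and variance at test inputs, given a few
# noisy observations. Variance collapses near data and reverts to the
# prior (1.0) far away.
def rbf(a, b, length=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length**2)

X = np.array([-2.0, 0.0, 2.0])        # training inputs
y = np.sin(X)                          # training targets
Xs = np.array([0.0, 10.0])             # test inputs: one near data, one far

K = rbf(X, X) + 1e-6 * np.eye(3)       # kernel matrix + noise jitter
Ks = rbf(Xs, X)
alpha = np.linalg.solve(K, y)

mean = Ks @ alpha                                             # posterior mean
var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)   # posterior variance
# var[0] ~ 0 (test point sits on data); var[1] ~ 1 (prior uncertainty).
```

The O(n³) `solve` over the full kernel matrix is the scaling bottleneck the entry mentions.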

Gazebo

An open-source robot simulator integrated with ROS, providing physics simulation (ODE, Bullet, DART, Simbody), 3D rendering, and sensor models. Gazebo is the traditional simulation environment for ROS-based robot development. Gazebo Classic is being superseded by the newer Gazebo (formerly Ignition), which offers improved modularity, rendering, and distributed simulation.

SimulationSoftware

GelSight

A vision-based tactile sensor that uses a soft elastomer pad coated with a reflective membrane. When the pad contacts an object, an embedded camera captures the deformation of the membrane under controlled illumination, producing high-resolution geometry (depth map) and force estimates of the contact surface. GelSight sensors enable fine manipulation and texture recognition.

SensorsTactileHardware

Generalized Coordinates

The minimal set of independent variables that completely describe the configuration of a mechanical system. For a serial robot arm, the generalized coordinates are the joint angles (or displacements for prismatic joints). For a free-flying base (e.g., drone, floating humanoid), six additional coordinates describe the base pose. The number of generalized coordinates equals the system's degrees of freedom.

KinematicsDynamics

Genesis

A generative physics simulation platform designed for robot learning, capable of generating diverse training environments procedurally. Genesis emphasizes fast, GPU-accelerated physics and differentiable simulation for gradient-based optimization. It represents the trend toward simulation platforms purpose-built for large-scale robot policy training.

SimulationSoftware

Global Planner

A path planning algorithm that computes a complete route from the robot's current position to the goal on a known map. Global planners (A*, Dijkstra, RRT, PRM) produce a reference path that the local planner then follows while avoiding dynamic obstacles. In ROS, the global planner runs on the static costmap at lower frequency than the local planner.

NavigationPlanning
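
A minimal A* sketch on a 4-connected occupancy grid, with Manhattan distance as the admissible heuristic (the grid and function name are illustrative):

```python
import heapq

# A* on a tiny occupancy grid (0 = free, 1 = obstacle). The priority is
# f = cost-so-far + heuristic; with an admissible, consistent heuristic,
# the first time the goal is popped the path is optimal.
def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan
    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        f, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]:
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                step = (nr, nc)
                heapq.heappush(frontier, (cost + 1 + h(step), cost + 1, step, path + [step]))
    return None                      # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))   # must route around the wall in row 1
```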

Goal-Conditioned Policy

A policy trained to reach arbitrary goal states, where the goal is provided as an input alongside the current observation. Goals can be specified as target images, coordinates, object configurations, or language descriptions. Goal-conditioned policies enable a single model to execute many tasks by varying the goal, rather than training separate policies per task.

Robot LearningPolicy

Graph-Based SLAM

A SLAM formulation where robot poses and landmarks are nodes in a graph, and sensor measurements between them are edges with associated constraints. SLAM is solved by optimizing the graph to find the configuration of nodes that best satisfies all constraints. Graph optimization backends include g2o, GTSAM, and iSAM2. Most modern visual and LiDAR SLAM systems use graph-based formulations.

NavigationSLAM

Grasp Quality Metric

A numerical score assessing how good a grasp is, based on geometric and force analysis. Common metrics include the epsilon metric (largest disturbance wrench the grasp can resist), the volume of the grasp wrench space, manipulability at the grasp configuration, and learned quality scores from neural grasp evaluation networks. Higher-quality grasps are more robust to perturbations.

ManipulationGrasping

Gravity Compensation

A control technique that applies joint torques equal and opposite to the gravitational loads on each link, effectively making the arm feel weightless. Gravity compensation is the foundation of kinesthetic teaching — with gravity canceled, a human can freely guide the robot through demonstrations with minimal effort. It requires accurate knowledge of link masses and centers of mass.

ControlTeleoperation
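
A gravity-torque sketch for a hypothetical 2-link planar arm with point masses at the link ends; masses, lengths, and the horizontal angle convention are illustrative assumptions:

```python
import numpy as np

# Gravity compensation for a 2-link planar arm with point masses at the
# link ends. Angles are measured from the horizontal; q2 is relative to q1.
m1, m2 = 2.0, 1.0      # link masses (kg), illustrative
l1, l2 = 0.5, 0.4      # link lengths (m), illustrative
g = 9.81

def gravity_torques(q1, q2):
    # Torques each joint must apply to exactly cancel gravity: the
    # gradient of the potential energy with respect to the joint angles.
    tau2 = m2 * g * l2 * np.cos(q1 + q2)
    tau1 = (m1 + m2) * g * l1 * np.cos(q1) + tau2
    return tau1, tau2

tau1, tau2 = gravity_torques(np.pi / 2, 0.0)   # arm pointing straight up
# Straight up, gravity exerts no moment, so both torques are ~0; with
# these torques always applied, a human can guide the arm freely.
```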

GTSAM

Georgia Tech Smoothing and Mapping — a C++ library for factor graph-based optimization, widely used in SLAM, visual-inertial odometry, and sensor fusion. GTSAM provides efficient implementations of incremental solvers (iSAM2), nonlinear optimization (Levenberg-Marquardt), and probabilistic inference on factor graphs. It is the backend of many state-of-the-art SLAM systems.

SoftwareSLAM

Gyroscope

An inertial sensor that measures angular velocity (rotation rate) around one or more axes. MEMS gyroscopes are standard in robotics IMUs, providing data for orientation estimation and angular motion tracking. Drift over time (bias instability) is the key limitation; sensor fusion with accelerometers and magnetometers compensates for this.

SensorsHardware

GAN

Generative Adversarial Network — a framework where a generator network learns to produce realistic samples by competing against a discriminator network that tries to distinguish generated from real samples. In robotics, GANs are used for domain adaptation (translating simulated images to look real), data augmentation, and adversarial imitation learning (GAIL).

MLGenerative

Gradient Clipping

Capping the magnitude of gradients during training to prevent exploding gradients that can destabilize optimization. Gradient norms are clipped to a maximum value (typically 1.0–10.0). Gradient clipping is standard practice for training transformers, RNNs, and RL policies where reward variance can cause large gradient spikes.

MLTraining
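
A sketch of global-norm clipping, the scheme behind utilities like `torch.nn.utils.clip_grad_norm_`, written in plain numpy (the gradient values are illustrative):

```python
import numpy as np

# Global-norm gradient clipping: if the combined norm of all parameter
# gradients exceeds max_norm, scale every gradient down uniformly so the
# update direction is preserved but its magnitude is bounded.
def clip_by_global_norm(grads, max_norm=1.0):
    total = np.sqrt(sum(float(np.sum(g**2)) for g in grads))
    if total > max_norm:
        scale = max_norm / total
        grads = [g * scale for g in grads]
    return grads, total

grads = [np.array([3.0, 0.0]), np.array([4.0])]   # global norm = 5
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
# After clipping, the global norm is exactly max_norm.
```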

GRU

Gated Recurrent Unit — a recurrent neural network architecture that uses gating mechanisms (reset and update gates) to selectively remember or forget information over time. GRUs are simpler than LSTMs with comparable performance. In robotics, GRUs are used in policies that require memory of past observations, such as POMDP settings where the full state is not observable.

MLArchitecture

GAIL

Generative Adversarial Imitation Learning — an algorithm that combines imitation learning with adversarial training. A discriminator learns to distinguish expert demonstrations from policy rollouts; the policy is then trained to fool the discriminator via RL. GAIL learns reward functions implicitly and achieves expert-level performance without hand-designed rewards.

RLImitation Learning

GAN Domain Adaptation

Using a GAN to translate images from a source domain (e.g., simulation) to look like a target domain (e.g., real world) without paired training data. CycleGAN and UNIT are standard architectures. In sim-to-real transfer, domain adaptation reduces the visual gap by transforming simulated images before feeding them to a vision policy.

VisionSim-to-Real

Gaussian Splatting

3D Gaussian Splatting (3DGS) is a scene representation using a collection of 3D Gaussians, each with position, covariance, color, and opacity. 3DGS renders photorealistic novel views at real-time rates and enables fast scene editing. In robotics, 3DGS is used for digital twin construction, manipulation scene understanding, and NeRF-alternative reconstruction.

Vision3D

Geometric Primitive Fitting

Fitting simple geometric shapes (planes, spheres, cylinders, boxes) to point cloud data using RANSAC or least-squares methods. Plane fitting with RANSAC is used for table detection and ground plane estimation in robot navigation. Cylinder fitting is used for pipe localization in inspection robots. Primitives provide compact, interpretable scene representations.

Vision3D

Gait Transition

Switching smoothly between different locomotion gaits (e.g., walk to trot, trot to gallop) in response to speed commands or terrain changes. Gait transitions must avoid discontinuities in joint trajectories and contact forces. RL-trained locomotion policies often exhibit emergent gait transitions — switching gaits naturally as commanded speed changes, without explicit gait-selection logic.

Locomotion

Ground Reaction Force

The force exerted by the ground on the robot's foot during stance — the reaction to the force the foot applies to the ground. GRF has normal (vertical, supporting weight) and tangential (horizontal, providing traction) components. The ratio of tangential to normal GRF must stay within the friction cone to avoid slipping. GRFs are measured by force plates in locomotion research and estimated from joint torques on robots.

LocomotionControlSensors

Grasp Affordance

The property of an object that enables it to be grasped — its handles, edges, flat surfaces, and other geometric features that support stable contact. Detecting grasp affordances from point clouds or images predicts where and how a gripper should make contact. Neural affordance models learn to predict grasping regions without explicit geometric analysis.

ManipulationVision

Gravity Grasp

A grasp that relies on gravity to maintain contact rather than active gripper force — placing a robot arm or plate under an object and relying on weight to keep it in position. Gravity grasps are used for large, flat objects (books, trays) that are difficult to pinch-grasp. They require careful approach path planning to slide under the object.

ManipulationGrasping

Gripper Calibration

Calibrating the relationship between gripper motor command and finger position/force, accounting for mechanical play, motor backlash, and sensor offsets. Gripper calibration ensures that commanded positions correspond to actual finger poses and that measured forces accurately reflect contact forces. Essential for precision assembly and delicate object handling.

ManipulationCalibration

Gear Ratio

The ratio of input shaft speed to output shaft speed (or equivalently, output torque to input torque, neglecting friction). A gear ratio of 100:1 reduces speed by 100× and increases torque by 100× (minus losses). High gear ratios increase payload capacity but reduce backdrivability and bandwidth. Most robot arm joints use ratios of 50:1–200:1.

HardwareActuation
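
A short arithmetic sketch of the speed/torque trade, plus the N² reflected-inertia scaling that explains the backdrivability loss (all values are illustrative):

```python
# Gear-ratio arithmetic: a reduction N divides speed, multiplies torque,
# and scales the motor's reflected inertia by N^2 (values illustrative).
N = 100                       # 100:1 reduction
motor_speed_rpm = 3000.0
motor_torque_nm = 0.5
motor_inertia = 1e-5          # kg*m^2, rotor inertia

output_speed_rpm = motor_speed_rpm / N          # 30 rpm at the joint
output_torque_nm = motor_torque_nm * N          # 50 Nm (ignoring losses)
reflected_inertia = motor_inertia * N**2        # 0.1 kg*m^2 seen at the joint
# The N^2 inertia scaling is why high-ratio joints are hard to backdrive.
```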

Ground Plane

The common reference electrical potential (ground) in a robot's electronics. Proper ground plane design in PCBs and robot wiring prevents noise coupling, ground loops, and EMC emissions. Star grounding (all ground returns to a single point) and chassis isolation are important for preventing motor switching noise from corrupting sensor signals.

HardwareElectronics

GPU Computing

Using the massively parallel architecture of graphics processing units (GPUs) for general-purpose computation. In robotics, GPUs accelerate: deep learning inference (perception, policies), parallel physics simulation (thousands of environment instances), point cloud processing, and training. NVIDIA Jetson AGX Orin is the standard edge GPU platform for robot onboard inference.

HardwareSoftwareML

Gauss-Newton

An iterative numerical algorithm for nonlinear least-squares optimization, approximating the Hessian as JᵀJ (where J is the Jacobian of residuals). Gauss-Newton converges faster than gradient descent for least-squares problems and is the basis of Levenberg-Marquardt. Used in robot calibration, bundle adjustment, and nonlinear state estimation.

Math
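
A scalar-parameter sketch of the Gauss-Newton iteration, fitting y = exp(a·x) to synthetic data (the model and data are illustrative):

```python
import numpy as np

# Gauss-Newton for nonlinear least squares: iterate
# a <- a + (J^T J)^{-1} J^T r, where r are residuals and J is their
# Jacobian. With a scalar parameter this collapses to a scalar divide.
x = np.array([0.0, 1.0, 2.0])
y = np.exp(0.5 * x)            # synthetic data generated with a = 0.5

a = 0.0                        # initial guess
for _ in range(20):
    pred = np.exp(a * x)
    r = y - pred               # residuals
    J = x * pred               # d(pred)/da, one entry per data point
    a += float(J @ r) / float(J @ J)   # normal-equations step
# a converges to the generating value 0.5
```

On zero-residual problems like this one, Gauss-Newton converges quadratically near the solution, which is why it (and its damped variant, Levenberg-Marquardt) dominates calibration and bundle adjustment.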

Goods-to-Person

A warehouse fulfillment paradigm where AMRs bring storage units (pods, shelves) to stationary human pickers, rather than humans walking to product locations. Goods-to-person systems (Amazon Robotics, AutoStore) dramatically reduce picker travel time (by 70–80%), increasing productivity. The AMR fleet is managed by a central WCS (Warehouse Control System).

ApplicationsIndustrialMobile Robotics

Generative Robot Data

Using generative models (diffusion models, NeRF, video prediction) to synthesize additional robot training data beyond what is physically collected. Generated data can augment rare scenarios, novel object appearances, or failure cases. Quality of generated data matters — poorly generated data can harm performance. This is an active research area for data-efficient robot learning.

Robot LearningData

Graph Neural Network Robotics

Applying graph neural networks (GNNs) to robotics tasks where the domain has natural graph structure: multi-robot coordination (robots as nodes), multi-object manipulation (objects as nodes, relationships as edges), and robot kinematics (links as nodes, joints as edges). GNNs generalize better than MLPs to variable numbers of entities.

Robot LearningML

H

HDF5 (Hierarchical Data Format v5)

HDF5 is a binary file format and library for storing and accessing large, structured scientific datasets efficiently. In robotics, HDF5 is the standard container for robot demonstration datasets: a single file stores synchronized camera images, joint angles, gripper states, force readings, and metadata in hierarchical groups, with chunked I/O enabling fast random access during training. The LeRobot and ALOHA ecosystems both use HDF5 natively. The alternative Zarr format offers cloud-native chunked storage with better support for concurrent writes. SVRC's data collection pipelines output HDF5 by default.

DataStorageEngineering

Humanoid Robot

A humanoid robot has a body structure broadly similar to a human — typically a torso, two legs, two arms, and a head — enabling it to operate in environments designed for people and to use human tools. Notable humanoids include Boston Dynamics Atlas, Agility Robotics Digit, Figure 01, and Tesla Optimus. Humanoids present extreme engineering challenges: bipedal locomotion requires real-time balance control, and coordinating 30+ DOF for loco-manipulation tasks demands whole-body control. Despite this complexity, humanoids are attracting enormous investment because their form factor generalizes across diverse workplaces without infrastructure changes.

HardwareLocomotionBimanual

Human-Robot Interaction (HRI)

Human-robot interaction is an interdisciplinary field studying how people and robots communicate, collaborate, and share physical space effectively and safely. HRI research spans safety standards (ISO/TS 15066 for collaborative robots), user interface design for teleoperation, natural language instruction, legible robot motion (making robot intent readable to bystanders), and social robotics (using gaze, gesture, and speech for non-verbal communication). In industrial co-bot deployments, HRI directly determines whether workers accept and effectively use robots alongside them. Good HRI design reduces accidents, improves throughput, and lowers training burden on the human side.

SafetyCollaboration

Hand-Eye Calibration

Determining the rigid transformation between a camera and the robot's end-effector (eye-in-hand) or between a camera and the robot base (eye-to-hand). Hand-eye calibration is solved by collecting multiple pairs of (robot pose, camera observation of a known pattern) and solving AX=XB (eye-in-hand) or AX=ZB (eye-to-hand) matrix equations.

CalibrationVision

Harmonic Drive

A compact gear mechanism (also called strain-wave gear) that achieves very high reduction ratios (50:1–200:1) in a small package with near-zero backlash. Widely used in robot arm joints (UR, Franka, KUKA) for its compactness and precision. Disadvantages include limited backdrivability, nonlinear friction, and a characteristic flexibility that must be modeled in high-performance controllers.

HardwareActuation

Hexapod

A six-legged walking robot inspired by insect locomotion. Hexapods offer static stability — with six legs, the robot can always maintain a stable tripod of support while moving the other three legs. This makes hexapod locomotion simpler to control than bipedal or quadrupedal gaits. Hexapods are used for rough terrain traversal and scientific exploration.

LocomotionHardware

Hindsight Experience Replay

A data augmentation technique for goal-conditioned RL that relabels failed trajectories with the states actually reached as goals. If the robot tried to place a block at position A but reached position B, HER creates an additional training example where B is the goal and the trajectory is a success. This dramatically improves sample efficiency for sparse-reward goal-reaching tasks.

Robot LearningRL
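
A sketch of the relabeling step itself; the episode dictionary structure is illustrative, not from any specific library:

```python
# Hindsight relabeling: a failed goal-reaching trajectory becomes a
# successful one for the goal that was actually achieved.
def relabel_with_hindsight(episode):
    achieved = episode["states"][-1]          # final state actually reached
    return {
        "states": episode["states"],
        "actions": episode["actions"],
        "goal": achieved,                      # pretend this was the goal...
        "success": True,                       # ...which makes it a success
    }

failed = {
    "states": [(0, 0), (1, 0), (1, 1)],       # tried to reach A = (2, 2)
    "actions": ["right", "up"],
    "goal": (2, 2),
    "success": False,
}
hindsight = relabel_with_hindsight(failed)     # relabeled goal is B = (1, 1)
# Both the original failure and the relabeled success go into the replay
# buffer, turning every episode into useful supervision.
```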

Homogeneous Transformation

A 4×4 matrix that encodes both rotation (3×3 rotation matrix R) and translation (3×1 vector t) in a single linear algebra operation. Multiplying homogeneous transformation matrices chains coordinate frame transforms — this is how forward kinematics computes end-effector pose from joint angles. The bottom row is always [0, 0, 0, 1].

KinematicsMath
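
A sketch of building and chaining 4×4 transforms; the yaw-only rotation and the frame names are illustrative:

```python
import numpy as np

# Build a 4x4 homogeneous transform from a rotation about z and a
# translation, then chain two of them by matrix multiplication.
def transform(yaw, t):
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0], [s, c, 0], [0, 0, 1]]   # rotation block R
    T[:3, 3] = t                                      # translation block t
    return T                                          # bottom row stays [0, 0, 0, 1]

T_base_link = transform(np.pi / 2, [1.0, 0.0, 0.0])   # link frame: 90° about z
T_link_tool = transform(0.0, [1.0, 0.0, 0.0])          # tool offset along link x

T_base_tool = T_base_link @ T_link_tool                # chained transform
p_tool = T_base_tool[:3, 3]
# The tool's +x offset, expressed in the rotated link frame, points along
# base +y, so the tool sits at (1, 1, 0) in base coordinates.
```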

Hybrid Force-Position Control

A control framework that independently controls force along some Cartesian directions and position/velocity along the complementary directions. The task space is partitioned using a selection matrix: force-controlled axes maintain desired contact forces while position-controlled axes follow geometric trajectories. Used for tasks like surface polishing, peg insertion, and contour following.

ControlManipulation
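A 1-timestep sketch of the selection-matrix idea: position gains act on the selected axes, force gains on the complementary ones (gains and axis choices here are illustrative):

```python
import numpy as np

# Selection matrix S: 1 = position-controlled axis, 0 = force-controlled axis.
S = np.diag([1.0, 1.0, 0.0])          # x, y track position; z regulates force

x_err = np.array([0.01, -0.02, 0.3])  # Cartesian position error (z entry ignored)
f_err = np.array([0.0, 0.0, 2.0])     # force error along z (desired - measured)

Kp, Kf = 100.0, 0.05                  # illustrative proportional gains
v_cmd = S @ (Kp * x_err) + (np.eye(3) - S) @ (Kf * f_err)
print(v_cmd)                          # commanded Cartesian velocity per axis
```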

Hello Robot Stretch

A mobile manipulator designed for home assistance, featuring a telescoping arm, differential drive base, and Intel RealSense cameras. Stretch is lightweight (23kg), affordable (~$25K), and designed for assistive tasks: fetching objects, opening drawers, and helping with daily living. It runs ROS 2 and is used in assistive robotics research.

HardwareMobile RoboticsManipulation

Hospital Robot

A robot deployed in healthcare settings for tasks including: medication delivery, sample transport, disinfection (UV robots), surgical assistance, patient monitoring, and rehabilitation. Hospital robots must navigate crowded corridors, interact safely with patients and staff, and comply with medical device regulations. They are one of the fastest-growing service robot categories.

ApplicationsMobile Robotics

H-Infinity Control

A robust control framework that minimizes the worst-case gain from disturbance inputs to performance outputs (the H∞ norm of the closed-loop transfer matrix). H∞ controllers provide guaranteed robustness to bounded model uncertainty and disturbances. They are used in precision robot motion control where structured uncertainty in the dynamics model must be handled robustly.

Control

Homography

A projective transformation that maps points from one planar surface to another, represented as a 3×3 matrix. Homographies describe the relationship between two views of the same planar scene. In robotics, homographies are used for: bird's-eye view generation, planar object tracking, and image-to-image warping in visual servoing.

VisionCalibration
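Applying a homography is a matrix multiply in homogeneous coordinates followed by a perspective divide; a small numpy sketch with a made-up H:

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 points through a 3x3 homography (divide by the w coordinate)."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coords
    mapped = (H @ pts_h.T).T
    return mapped[:, :2] / mapped[:, 2:3]             # perspective divide

# A toy homography: scale x by 2, shift, plus a small projective term.
H = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [0.1, 0.0, 1.0]])
pts = np.array([[0.0, 0.0], [1.0, 2.0]])
out = apply_homography(H, pts)
print(out)
```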

Human Pose Estimation

Detecting and localizing human body keypoints (joints) in images or video. 2D pose estimation predicts (x,y) pixel positions; 3D pose estimation predicts positions in 3D space. In robotics, human pose estimation enables HRI (detecting human intentions from body language), collision avoidance around people, and teleoperation via body motion capture.

VisionHRI

Handover

The act of transferring an object from one agent (robot or human) to another. Robot-to-human handovers require the robot to anticipate the human's grasp pose and timing, and release the object smoothly. Human-to-robot handovers require robust grasp pose estimation of the incoming object and smooth acceptance. Handover is a fundamental HRI manipulation primitive.

ManipulationHRI

Hybrid Grasp

A grasp combining multiple contact types — e.g., suction + parallel-jaw, fingers + palm, or gripper + magnetic contact. Hybrid grasps leverage the complementary strengths of different contact mechanisms: suction provides initial stability while fingers provide orientation control. They improve reliability for irregular-shaped or mixed-material objects.

ManipulationGrasping

Heat Pipe

A passive heat transfer device consisting of a sealed tube with working fluid that evaporates at the heat source and condenses at the heat sink, transferring heat with very high effective conductivity. Heat pipes are used in robot drive electronics and motor windings for efficient, silent cooling without pumps.

Hardware

Hollow Shaft

A motor or joint shaft with a central bore, allowing cables, pneumatic lines, or laser beams to pass through the center of the joint. Hollow shafts enable clean internal cable routing in robot arms, eliminating dangling cables that can catch on obstacles or wear through motion. They are standard in collaborative robots and exoskeletons.

HardwareActuation

Human-in-the-Loop

A system design where a human provides guidance, supervision, or correction during autonomous robot operation. Human-in-the-loop approaches balance automation efficiency with human expertise for edge cases. Active learning (querying human labels for uncertain predictions) and shared autonomy (human takes over when the robot is uncertain) are human-in-the-loop paradigms.

ApplicationsHRIRobot Learning

Hybrid Imitation RL

A training paradigm that combines imitation learning (from demonstrations) and reinforcement learning (from environment interaction) to leverage the sample efficiency of demonstrations while allowing RL to improve beyond the expert's performance. Methods include DAPG (demo augmented policy gradient), IRL followed by RL, and demonstration-augmented variants of on-policy algorithms such as PPO.

Robot LearningRLImitation Learning

I

Imitation Learning (IL)

Imitation learning is a family of machine learning methods that train robot policies from human demonstrations rather than from engineered reward functions. The simplest form is behavioral cloning (supervised regression on state-action pairs). More advanced variants — DAgger (iterative correction), GAIL (adversarial imitation), and IRL (recovering a reward function) — address the distributional shift and reward specification problems that plague pure BC. IL has become the dominant paradigm for teaching dexterous manipulation because reward engineering for complex manipulation is extremely difficult, whereas collecting human demonstrations is tractable at scale via teleoperation. See the full deep-dive article.

Core ConceptPolicyData

Inverse Kinematics (IK)

Inverse kinematics solves for the joint angles that place a robot's end-effector at a desired Cartesian pose. Unlike forward kinematics, IK may have zero, one, or infinitely many solutions depending on the robot's kinematic structure and the target pose. Analytic IK solvers exist for standard 6-DOF configurations; numerical methods (Jacobian pseudo-inverse, Newton-Raphson, optimization-based) handle arbitrary geometries and redundant robots. IK is used in motion planning, teleoperation mapping (converting operator hand pose to joint commands), and any Cartesian-space controller. Libraries like KDL, IKFast, and TRAC-IK are commonly used in ROS environments.

KinematicsControlPlanning

Isaac Sim

NVIDIA Isaac Sim is a robotics simulation platform built on the Omniverse USD framework, providing high-fidelity physics (via PhysX 5), photo-realistic rendering (via RTX path tracing), and ROS 2 integration out of the box. It is purpose-built for generating synthetic training data, testing robot policies, and sim-to-real transfer research. Isaac Sim supports domain randomization of textures, lighting, and object poses at scale, and integrates with NVIDIA's Isaac Lab reinforcement learning framework. Its GPU-accelerated physics allows training RL policies with thousands of parallel simulation instances. Learn more at the SVRC Isaac Sim resource page.

SimulationSynthetic DataTool

Impedance Control

A control strategy that regulates the dynamic relationship between a robot's motion and the forces it exchanges with the environment. Instead of commanding pure position or pure force, impedance control makes the robot behave like a virtual mass-spring-damper system. It is the gold standard for contact-rich manipulation tasks and physical human-robot interaction.

ControlSafetyManipulation
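The virtual mass-spring-damper behavior can be illustrated with a 1-D simulation: a constant external push displaces the end-effector by f_ext / k at steady state (parameter values here are illustrative):

```python
import numpy as np

def simulate_impedance(m=1.0, b=8.0, k=100.0, f_ext=5.0, dt=0.001, T=2.0):
    """Integrate m*xdd + b*xd + k*(x - x_des) = f_ext with semi-implicit Euler."""
    x, xd, x_des = 0.0, 0.0, 0.0
    for _ in range(int(T / dt)):
        xdd = (f_ext - b * xd - k * (x - x_des)) / m
        xd += xdd * dt
        x += xd * dt
    return x

x_ss = simulate_impedance()
print(round(x_ss, 3))  # steady-state deflection ~ f_ext / k = 0.05 m
```

Tuning k trades position accuracy against contact gentleness, which is exactly the knob impedance control exposes.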

IMU

An Inertial Measurement Unit combines accelerometers and gyroscopes (and sometimes magnetometers) to measure a body's specific force, angular rate, and magnetic field. IMUs are essential for legged robot balance, drone stabilization, and mobile robot odometry. 6-axis (3 accel + 3 gyro) and 9-axis (+ magnetometer) are common configurations. Sensor fusion algorithms convert raw IMU data into orientation estimates.

SensorsHardware

In-Context Learning

The ability of a large pre-trained model to adapt its behavior to new tasks based on a few examples provided in the input prompt, without updating model weights. In robotics, in-context learning enables VLA models to perform new manipulation tasks when shown a few demonstration frames as context. This parallels how GPT-style models handle few-shot text tasks.

Robot LearningVision-Language

In-Hand Manipulation

Repositioning or reorienting an object within the robot's grasp using finger motions, without placing it down. In-hand manipulation is essential for dexterous assembly and tool use. It requires multi-fingered hands with tactile sensing and sophisticated control. Tasks include object rotation, translation, and pivoting. RL with domain randomization has achieved impressive in-hand rotation results.

ManipulationDexterous

Inertia Tensor

A 3×3 symmetric matrix that describes how mass is distributed relative to a body's rotation axes. The diagonal elements are moments of inertia; the off-diagonal elements are products of inertia. Accurate inertia tensors for each link are required for dynamics computation, gravity compensation, and model-based control. They are typically estimated from CAD models or system identification experiments.

Dynamics

Insertion Task

A contact-rich manipulation task where the robot must insert a part (peg, connector, key) into a receptacle with tight tolerance. Insertion requires precise alignment, force control, and compliance to handle uncertainty. Peg-in-hole insertion is a classic robotics benchmark. Industrial insertion tasks (USB connectors, PCB components) are a major automation application.

ManipulationIndustrial

Inverse RL (IRL)

Learning a reward function from expert demonstrations, under the assumption that the expert is (approximately) optimally maximizing that reward. IRL avoids the need to hand-engineer reward functions — instead, the reward is inferred and then used to train a policy via standard RL. Maximum entropy IRL and adversarial IRL (AIRL) are popular formulations. IRL is closely related to GAIL.

Robot LearningRL

ISO 10218

The international standard for industrial robot safety, consisting of two parts: Part 1 covers the robot itself (design, protective measures, verification), and Part 2 covers robot systems and integration (cell layout, safeguarding, commissioning). ISO 10218 defines safety requirements for industrial robots, including stopping functions, speed limits, and collaborative operation modes.

SafetyStandards

ISO/TS 15066

A technical specification providing guidance on collaborative robot safety, supplementing ISO 10218. It defines biomechanical limits for human-robot contact: maximum allowable forces and pressures for different body regions during transient and quasi-static contact. These limits are the quantitative foundation for collaborative robot safety assessments.

SafetyStandards

ImageNet

A large-scale visual recognition dataset containing 14 million images across 20,000+ categories. ImageNet pre-training established the paradigm of transfer learning in computer vision. In robotics, ImageNet-pre-trained CNNs (ResNet, EfficientNet) serve as default visual encoders. The features learned from ImageNet classification transfer surprisingly well to robot perception tasks.

DatasetVision

IQL

Implicit Q-Learning — an offline RL algorithm that avoids querying the Q-function on out-of-distribution actions by using an expectile regression objective. IQL achieves strong offline RL performance without the explicit conservatism of CQL, making it simpler to implement and tune. It has become popular for robot manipulation tasks trained on demonstration data.

RL

Inspection Robot

A robot designed to inspect infrastructure (pipelines, bridges, power lines, wind turbines) or industrial assets (tanks, boilers, aircraft). Inspection robots carry cameras, ultrasonic sensors, or other NDT instruments to locations that are difficult, dangerous, or expensive for humans to access. They include crawlers, climbers, drones, and underwater ROVs.

ApplicationsMobile Robotics

Iterative Learning Control

A control method that improves tracking performance over repeated executions of the same task by using the error from previous trials to update the feedforward control signal. ILC is ideal for repetitive industrial tasks (welding, painting) where the same trajectory is executed thousands of times. It achieves near-perfect tracking without an accurate dynamics model.

Control
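The trial-to-trial update can be sketched on a toy static plant (a P-type ILC; plant and gains are illustrative, real ILC operates on dynamic systems):

```python
import numpy as np

def ilc_trials(ref, plant_gain=0.8, learn_gain=0.5, trials=30):
    """P-type ILC on the toy plant y = g * u: after each trial, the
    feedforward input is corrected by that trial's tracking error."""
    u = np.zeros_like(ref)
    errors = []
    for _ in range(trials):
        y = plant_gain * u            # execute the trial
        e = ref - y                   # tracking error over the trajectory
        errors.append(np.abs(e).max())
        u = u + learn_gain * e        # ILC update for the next trial
    return errors

ref = np.sin(np.linspace(0, np.pi, 50))  # desired trajectory
errs = ilc_trials(ref)
print(errs[0], errs[-1])                 # error shrinks trial over trial
```

Note that convergence requires no accurate plant model, only a contraction in the error dynamics (here |1 − g·L| < 1).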

Instance Segmentation

A vision task that detects and delineates each individual object instance with a pixel-level mask. Mask R-CNN is the classic architecture; modern approaches include SOLOv2, CondInst, and SAM-based methods. Instance segmentation is used for robotic pick-and-place in cluttered scenes where multiple instances of the same object class must be individually distinguished.

VisionManipulation

Impedance Matching

Adjusting the mechanical impedance (stiffness and damping) of the robot end-effector to match the impedance of the task environment, minimizing energy exchange during contact. Soft environments (foam, tissue) benefit from low stiffness (compliant) control; rigid environments (metal assembly) benefit from higher stiffness. Impedance matching improves contact stability and reduces impact forces.

ManipulationControl

I2C Protocol

Inter-Integrated Circuit — a low-speed synchronous serial communication protocol using two wires (SDA data, SCL clock). I2C supports multiple devices on the same bus with 7-bit addressing (112 usable addresses after reserved ones). In robotics, I2C connects low-bandwidth sensors (temperature, pressure, IMU, LCD) to microcontrollers. It is not suitable for real-time motor control due to its low speed (100 kHz–3.4 MHz).

HardwareElectronicsSoftware

ICP Algorithm

Iterative Closest Point — an algorithm that aligns two point clouds by alternating between: (1) finding closest-point correspondences, and (2) computing the optimal rigid transform minimizing correspondence distances. ICP is the standard algorithm for LiDAR scan matching in robot mapping and object pose tracking from depth data.

MathVisionSLAM
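The two alternating steps can be sketched in 2-D with brute-force nearest neighbours and the closed-form (SVD/Kabsch) rigid alignment; this is a toy illustration, not a production scan matcher:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Kabsch: least-squares R, t mapping src -> dst for known correspondences."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - mu_s).T @ (dst - mu_d))
    R = (U @ Vt).T
    if np.linalg.det(R) < 0:          # guard against reflections
        Vt[-1] *= -1
        R = (U @ Vt).T
    return R, mu_d - R @ mu_s

def icp(src, dst, iters=20):
    """Alternate (1) closest-point matching and (2) optimal rigid alignment."""
    cur = src.copy()
    for _ in range(iters):
        d = np.linalg.norm(cur[:, None] - dst[None], axis=2)
        matches = dst[d.argmin(axis=1)]       # step 1: correspondences
        R, t = best_rigid_transform(cur, matches)  # step 2: transform
        cur = cur @ R.T + t
    return cur

dst = np.random.default_rng(1).uniform(0, 1, (30, 2))
th = 0.03
R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
src = dst @ R.T + np.array([0.01, -0.01])     # slightly rotated + shifted copy
aligned = icp(src, dst)
print(np.abs(aligned - dst).max())            # small residual after alignment
```

Real ICP implementations use k-d trees for the correspondence step and robust outlier rejection; ICP also needs a reasonable initial guess to avoid local minima.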

Information Matrix

The inverse of the covariance matrix (also called the precision matrix), representing the certainty of an estimate. In information-form Kalman filters and factor graph optimization, working with information matrices is more natural for sparse systems — sensor observations add information (positive definite increments to the information matrix), while time propagation decreases it.

MathSLAMSensors

Inventory Management Robot

A robot that autonomously scans and audits warehouse inventory — tracking item locations, quantities, and conditions. Inventory robots (Simbe Tally, Gather AI drones) navigate store aisles or warehouse floors, capturing shelf images for automated counting and stock-out detection. They provide more frequent, accurate inventory data than periodic manual audits.

ApplicationsMobile Robotics

Incremental Learning

Adding new knowledge (tasks, objects, environments) to an existing robot learning system without full retraining. Incremental learning methods must balance plasticity (ability to learn new things) with stability (retaining old knowledge). In practice, incremental learning for robots involves careful dataset management, regularization, and modular architecture design.

Robot Learning

Interaction Data

Robot demonstration data recorded during physical contact with the environment, capturing force-torque measurements, contact states, and proprioceptive responses in addition to visual observations. Interaction data is richer than free-space data and is essential for learning contact-rich manipulation policies. It is more expensive to collect but provides crucial information about compliant task execution.

Robot LearningData

J

Joint Space (Configuration Space)

Joint space (also called configuration space or C-space) is the space of all possible joint angle vectors for a robot. A point in joint space uniquely specifies the robot's complete configuration. Motion planning algorithms like RRT and PRM work in joint space to find collision-free paths between configurations, since collision checking is more straightforward there than in Cartesian space. Many RL policies output joint positions or velocities directly in joint space, while imitation learning policies often operate in Cartesian space for easier human-demonstrator alignment. See the joint space article.

KinematicsPlanning

Joint Torque

Joint torque is the rotational force applied by a motor at a robot joint, measured in Newton-meters (Nm). Torque-controlled robots (as opposed to position-controlled ones) can regulate contact forces directly, enabling compliant behaviors such as yielding when pushed and precisely controlling assembly forces. Torque sensing at each joint is a key feature of collaborative robots (cobots) like the Franka Panda, Universal Robots UR series, and KUKA iiwa, enabling safe human-robot collaboration and whole-body compliant control. Policies that output joint torques rather than positions require careful training to avoid unstable oscillations.

ControlHardwareForce

Jacobian Matrix

A matrix of partial derivatives that maps joint velocities to end-effector velocities: ẋ = J(q)·q̇. The Jacobian is central to robot control — it enables Cartesian velocity control, force mapping (τ = Jᵀ·F), singularity analysis, and redundancy resolution. For an n-DOF arm with a 6D task, J is 6×n. The Jacobian is configuration-dependent and must be recomputed at each timestep.

KinematicsControl
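The velocity mapping ẋ = J(q)·q̇ can be checked numerically on a planar 2-link arm; the finite-difference Jacobian below is a generic sketch (link lengths are illustrative):

```python
import numpy as np

def fk_2link(q, l1=1.0, l2=0.8):
    """Planar 2-link forward kinematics: end-effector (x, y)."""
    x = l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1])
    y = l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def numeric_jacobian(f, q, eps=1e-6):
    """Finite-difference Jacobian: J[i, j] = d f_i / d q_j."""
    f0 = f(q)
    J = np.zeros((len(f0), len(q)))
    for j in range(len(q)):
        dq = q.copy()
        dq[j] += eps
        J[:, j] = (f(dq) - f0) / eps
    return J

q = np.array([0.3, 0.5])
J = numeric_jacobian(fk_2link, q)   # configuration-dependent: recompute per q
qdot = np.array([0.1, -0.2])
xdot = J @ qdot                     # Cartesian velocity from joint velocity
print(np.round(xdot, 4))
```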

Jerk Limit

The maximum allowed rate of change of acceleration (third derivative of position). Limiting jerk produces smooth motion profiles that reduce mechanical vibration, wear on actuators, and disturbance to payloads. Jerk-limited trajectory generators are standard in industrial robot controllers and collaborative robots where smooth motion improves both safety and precision.

ControlSafety

Joint Space Trajectory

A time-parameterized sequence of joint positions (and optionally velocities and accelerations) that define a desired robot motion. Joint space trajectories are generated by inverse kinematics and motion planners. Joint trajectory controllers (PD, PID, feedforward) track these references. Cubic splines and polynomial segments produce smooth, jerk-limited trajectories.

LocomotionControlKinematics
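A single cubic segment with zero boundary velocities is the simplest smooth joint-space trajectory; a worked sketch (boundary conditions chosen for illustration):

```python
import numpy as np

def cubic_coeffs(q0, qf, T):
    """Cubic q(t) = a0 + a1*t + a2*t^2 + a3*t^3 with q(0)=q0, q(T)=qf
    and zero velocity at both endpoints."""
    a0, a1 = q0, 0.0
    a2 = 3 * (qf - q0) / T**2
    a3 = -2 * (qf - q0) / T**3
    return a0, a1, a2, a3

def sample(coeffs, t):
    a0, a1, a2, a3 = coeffs
    q = a0 + a1 * t + a2 * t**2 + a3 * t**3
    qd = a1 + 2 * a2 * t + 3 * a3 * t**2
    return q, qd

c = cubic_coeffs(0.0, 1.0, 2.0)      # move a joint from 0 to 1 rad in 2 s
t = np.linspace(0, 2, 5)
q, qd = sample(c, t)
print(np.round(q, 3))                # smooth S-shaped position profile
print(np.round(qd, 3))               # velocity is zero at both endpoints
```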

Joint Coupling

A mechanical device that connects two shafts while accommodating small misalignments, reducing vibration transmission, and providing overload protection. Types include: rigid couplings (shaft-to-shaft, precise), flexible couplings (accommodate misalignment, absorb vibration), and jaw couplings (elastomeric insert provides compliance).

HardwareActuation

Junction Box

An enclosure where electrical cables are connected, providing protection, organization, and access points for wiring. In industrial robots, junction boxes route power and signal cables from the controller to the robot arm and peripheral equipment. IP-rated junction boxes protect connections from dust and moisture in industrial environments.

HardwareElectronics

K

Kinematic Chain

A kinematic chain is a series of rigid body links connected by joints that together form a robot's mechanical structure. An open chain (serial robot arm) has one free end (the end-effector), making FK straightforward. A closed chain (parallel robot, hexapod) has multiple loops that provide higher stiffness and speed but require more complex kinematics. The kinematic chain determines the robot's workspace, singularities, and the Jacobian matrix used for Cartesian control. URDF files describe kinematic chains as a tree of links and joints for simulation and control software.

KinematicsMechanics

Kinesthetic Teaching

Kinesthetic teaching (also called lead-through programming or direct guidance) is a method of robot programming where a human physically grasps the robot arm and moves it through the desired motion path while the robot records the trajectory. It requires the robot to be backdrivable (low joint friction and compliance) so the operator can move it with minimal effort. Kinesthetic teaching is intuitive and requires no external hardware, but it is limited to tasks the operator can physically demonstrate, and it produces only proprioceptive data (no wrist camera observations) unless cameras are co-recorded. Gravity compensation mode on torque-controlled robots like the Franka Panda makes kinesthetic teaching practical.

Data CollectionImitation Learning

Keypoint Detection

Detecting semantically meaningful points on objects or robot parts in images. In manipulation, keypoints define grasp points, insertion points, or object configurations. Dense object keypoints (like those from Transporter Networks) enable precise pick-and-place by detecting source and target keypoint locations. Keypoint-based state representations are compact and geometrically meaningful.

ManipulationVision

Kidnapped Robot Problem

A localization challenge where the robot is teleported to an unknown location on a known map without any information about its new position. The robot must re-localize itself purely from sensor observations. Particle filter-based localization (AMCL) and place recognition (NetVLAD, DBoW) are the standard solutions. It tests the robustness of localization systems to catastrophic initialization errors.

NavigationSLAM

Knowledge Distillation

Training a smaller student model to match the outputs (soft logits) of a larger teacher model. The student learns a compressed version of the teacher's knowledge. In robotics, distillation is used to: compress large VLA models for real-time inference, transfer privileged-information teachers to vision-only students, and create efficient edge-deployable perception models.

MLTraining

KUKA iiwa

A 7-DOF lightweight collaborative robot arm with joint torque sensors, developed by KUKA. The iiwa (intelligent industrial work assistant) was the first industrial robot designed from the ground up for human-robot collaboration. Its impedance control mode and sensitivity make it popular for research in force-controlled manipulation and safe interaction.

HardwareManipulationSafety

Keyframe Selection

Choosing a sparse subset of video frames (keyframes) that adequately represent the sequence for SLAM, 3D reconstruction, or policy training. Keyframe selection criteria include sufficient parallax, minimum overlap, keypoint count threshold, and temporal spacing. Effective keyframe selection reduces computation while preserving mapping quality.

VisionSLAM

Knot Tying

A complex manipulation task requiring the robot to manipulate a flexible rope or cable through a specific topology to create a stable knot. Knot tying challenges include: non-prehensile manipulation, deformable object state estimation, and planning in configuration spaces of flexible objects. It requires both fine motor skills and spatial reasoning.

ManipulationDexterous

Kalman Filter

An optimal linear state estimator that combines a dynamics model (prediction) with sensor measurements (update) to minimize mean squared estimation error. The standard Kalman filter assumes linear dynamics and Gaussian noise. Extended Kalman Filter (EKF) linearizes nonlinear models; Unscented Kalman Filter (UKF) uses sigma points for better nonlinear approximation.

MathSensorsControl
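The predict/update cycle in its simplest form, a scalar filter estimating a constant signal (noise parameters are illustrative):

```python
import numpy as np

def kalman_1d(z_measurements, q=1e-4, r=0.05):
    """1-D Kalman filter for a constant state: predict keeps the state and
    inflates its variance by q; update fuses a measurement with noise r."""
    x, p = 0.0, 1.0                  # initial estimate and variance
    estimates = []
    for z in z_measurements:
        p = p + q                    # predict (constant dynamics)
        k = p / (p + r)              # Kalman gain
        x = x + k * (z - x)          # update with the measurement
        p = (1 - k) * p
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(0)
z = 1.0 + rng.normal(0, 0.2, size=200)  # noisy readings of true value 1.0
est = kalman_1d(z)
print(est[-1])                          # settles near the true value
```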

L

Language-conditioned Policy

A language-conditioned policy takes a natural language instruction (e.g., "pick up the red cup and place it on the tray") as an additional input alongside visual observations, enabling a single policy network to perform multiple tasks selected at runtime without retraining. Language conditioning is typically implemented by encoding instructions with a pretrained language model (CLIP, T5, PaLM) and fusing the resulting embedding with image features. VLA models such as RT-2, OpenVLA, and pi0 are language-conditioned by design. This approach reduces the need to train separate policies per task and supports zero-shot generalization to novel instruction phrasings.

VLAFoundation ModelGeneralization

Latent Space

A latent space is a compressed, lower-dimensional representation of data learned by a neural network — the output of an encoder that captures the most task-relevant features of an observation. In robot learning, latent spaces are used in VAEs (variational autoencoders) for learning structured representations of visual scenes, in world models for predicting future states, and in CVAE-based policies (like ACT) for encoding multimodal action distributions. A well-structured latent space places semantically similar observations close together, enabling interpolation, planning, and data augmentation in the latent domain rather than in raw pixel space.

Representation LearningPolicy

LeRobot

LeRobot is Hugging Face's open-source library for robot learning, providing standardized implementations of imitation learning algorithms (ACT, Diffusion Policy, TDMPC), a unified dataset format, visualization tools, and pretrained model weights. It aims to lower the barrier to entry for robot learning research by providing a single cohesive framework analogous to what Transformers did for NLP. LeRobot integrates with the Hugging Face Hub for dataset and model sharing, and supports both simulated (gymnasium-robotics, MuJoCo) and physical robot environments. The companion SO-100 low-cost robot kit was released alongside it.

ToolOpen SourceImitation Learning

LeRobot HF Dataset

The LeRobot dataset format is a standardized schema for robot demonstration data hosted on the Hugging Face Hub. Each dataset consists of Parquet files (for scalar timeseries: joint positions, actions, rewards, done flags) plus compressed MP4 video chunks for camera streams, all indexed by episode and frame. A meta/info.json file describes camera names, robot type, fps, and data statistics used for normalization. This format allows any LeRobot-compatible algorithm to load any published dataset with a single line of code, enabling rapid cross-dataset experimentation. Dozens of manipulation and mobile manipulation datasets are already published in this format.

DataStandardOpen Source

Lagrangian Dynamics

A formulation of classical mechanics based on the Lagrangian L = T − V (kinetic energy minus potential energy). Applying the Euler-Lagrange equation to the Lagrangian of a robot system yields the manipulator equation of motion. This approach is the standard method for deriving robot dynamics models and is automated by symbolic computation tools like SymPy and Drake.

DynamicsMath

Latency

The time delay between a sensor measurement or command input and the corresponding robot action. Low latency is critical for teleoperation responsiveness, visual servoing, and dynamic tasks. Sources of latency include sensor processing, network transmission, computation, and actuator response. End-to-end latency budgets for reactive manipulation are typically 30–100ms.

ControlTeleoperation

Learned Reward

A reward function that is learned from data (human preferences, demonstrations, or language descriptions) rather than hand-engineered. Reward learning methods include inverse RL, reward modeling from human comparisons (RLHF-style), and VLM-based reward scoring (using CLIP similarity or LLM evaluation). Learned rewards address the challenge that well-shaped manual reward functions are often difficult to specify for complex manipulation tasks.

Robot LearningRL

Legged Robot

A robot that uses articulated legs for locomotion instead of wheels or tracks. Legged robots can traverse rough, unstructured terrain that wheeled robots cannot. Categories include bipeds (2 legs), quadrupeds (4 legs), and hexapods (6 legs). RL-trained locomotion policies, combined with sim-to-real transfer, have recently enabled robust legged locomotion over challenging terrain.

LocomotionHardware

LiDAR

Light Detection and Ranging — a sensor that measures distances by emitting laser pulses and timing their reflections. In robotics, 2D LiDAR (Hokuyo, RPLIDAR) provides planar scans for SLAM and obstacle avoidance; 3D LiDAR (Velodyne, Ouster, Livox) generates dense point clouds for autonomous vehicles and outdoor mobile robots. Key specs include range, angular resolution, scan rate, and multi-return capability.

SensorsNavigationHardware

Linear Actuator

An actuator that produces motion in a straight line rather than rotation. Common implementations include ball-screw, lead-screw, belt-drive, pneumatic cylinder, and voice-coil types. Linear actuators are used in prismatic joints, grippers, and Cartesian gantry robots. Stroke length, force, speed, and repeatability are the primary specifications.

HardwareActuation

Local Planner

A real-time controller that generates velocity commands to follow the global path while avoiding dynamic obstacles detected by sensors. Local planners (DWA, TEB, MPC-based) operate at high frequency (10–20 Hz) on the local costmap. They balance path tracking accuracy, smoothness, and obstacle avoidance within the robot's kinematic and dynamic constraints.

NavigationPlanning

Localization

Determining the robot's pose (position + orientation) within a known map using sensor observations. Localization methods include: AMCL (particle filter on laser scans), visual localization (matching camera images to a map), GPS/RTK (outdoor), and UWB (indoor). Robust localization is the prerequisite for autonomous navigation — a robot that doesn't know where it is can't plan a path to its goal.

NavigationSLAM

Long-Horizon Manipulation

Multi-step manipulation tasks that require planning and executing a sequence of skills over extended time periods. Examples: cooking a meal, assembling furniture, cleaning a room. Long-horizon tasks are challenging because they require task planning (deciding what to do), skill sequencing (chaining primitives), and error recovery. Hierarchical policies and LLM-based planners are current approaches.

ManipulationPlanning

Loop Closure

Detecting when the robot has returned to a previously visited location, enabling correction of accumulated drift in SLAM. Loop closure detection uses place recognition (visual bag-of-words, learned descriptors, LiDAR scan matching) to identify revisited locations, then triggers a graph optimization that corrects the entire trajectory. Without loop closure, maps progressively distort with distance traveled.

NavigationSLAM

LQR

Linear Quadratic Regulator — an optimal control algorithm that computes state-feedback gains minimizing a quadratic cost function on state deviations and control effort. LQR is widely used for balancing tasks (inverted pendulum, bipedal standing) and trajectory stabilization. It requires a linear dynamics model; for nonlinear robots, the system is linearized around operating points (LQR + gain scheduling).

Control
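
For a concrete toy instance, the discrete-time LQR gain can be computed by iterating the Riccati recursion. A minimal NumPy sketch on a double-integrator model (all constants illustrative, not from any particular library):

```python
import numpy as np

def dlqr(A, B, Q, R, iters=500):
    """Iterate the discrete Riccati recursion to convergence and return
    the state-feedback gain K for the control law u = -K x."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Double integrator (position, velocity), timestep dt = 0.1 s
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
K = dlqr(A, B, Q=np.eye(2), R=np.array([[1.0]]))

# Closed-loop eigenvalues lie inside the unit circle: stable
eigs = np.linalg.eigvals(A - B @ K)
print(np.all(np.abs(eigs) < 1.0))  # True
```

For a nonlinear robot, the same recipe is applied to linearizations about each operating point, giving the gain-scheduled variant mentioned above.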

L2 Regularization

Adding a penalty proportional to the squared magnitude of model weights to the loss function, discouraging large weights and reducing overfitting. L2 regularization (weight decay) is applied to nearly all neural network training. In robot learning, appropriate weight decay prevents the policy from memorizing specific demonstration trajectories.

MLTraining
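
As a minimal sketch of how the penalty enters the loss and its gradient (a tiny hypothetical linear model; all names and constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(32, 4)), rng.normal(size=32)   # toy regression data
w = rng.normal(size=4)
lam = 1e-2                                             # regularization strength

def loss(w):
    residual = X @ w - y
    return np.mean(residual**2) + lam * np.sum(w**2)   # MSE + L2 penalty

def grad(w):
    residual = X @ w - y
    return 2 * X.T @ residual / len(y) + 2 * lam * w   # penalty adds 2*lam*w

# One small gradient step reduces the regularized loss
w_new = w - 0.05 * grad(w)
print(loss(w_new) < loss(w))  # True
```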

Layer Normalization

A normalization technique that normalizes across the feature dimension of each individual sample (rather than across the batch, as in batch normalization). Layer normalization is the standard in transformer architectures because it works identically during training and inference and handles variable sequence lengths. It is used in nearly all VLA and policy transformer models.

MLArchitecture
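
A minimal NumPy sketch of the operation — per-sample normalization over the last (feature) axis, with `gamma` and `beta` as the learned scale and shift:

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each sample over its feature (last) axis, then apply
    the learned scale gamma and shift beta."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.random.default_rng(0).normal(size=(2, 8))       # 2 samples, 8 features
out = layer_norm(x, gamma=np.ones(8), beta=np.zeros(8))
# Each sample is normalized independently of the batch
print(np.allclose(out.mean(axis=-1), 0.0, atol=1e-6))  # True
```

Because no batch statistics are involved, the computation is identical at training and inference time — the property that makes it the default in transformers.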

Learning Rate

The step size used by gradient descent to update model parameters. Too high causes divergence; too low causes slow convergence. Learning rate scheduling (warmup + decay) is critical for stable training. Typical initial learning rates: 1e-4 for fine-tuning pre-trained models, 3e-4 for training from scratch with Adam. The learning rate is often the most important hyperparameter to tune.

MLTraining
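
A common warmup-plus-cosine-decay schedule can be sketched as follows (the constants here are illustrative, not a recommendation):

```python
import numpy as np

def lr_schedule(step, warmup=1000, total=100_000, base_lr=3e-4):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup:
        return base_lr * step / warmup
    progress = (step - warmup) / (total - warmup)
    return 0.5 * base_lr * (1 + np.cos(np.pi * progress))

print(lr_schedule(500) < lr_schedule(1000))  # True: still warming up
print(np.isclose(lr_schedule(1000), 3e-4))   # True: peak at end of warmup
print(lr_schedule(100_000) < 1e-8)           # True: decayed to ~0
```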

LoRA

Low-Rank Adaptation — a parameter-efficient fine-tuning method that freezes the pre-trained model weights and injects trainable low-rank matrices into each layer. LoRA dramatically reduces the number of trainable parameters (often 1-10% of full fine-tuning) while achieving comparable performance. It is increasingly used for adapting large VLA models to specific robot embodiments or tasks.

MLTraining
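
A minimal sketch of the adapter math: a frozen weight `W` plus a scaled low-rank update. With `B` initialized to zero (the standard initialization), the adapted layer starts exactly equal to the pre-trained one; dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 64, 4

W = rng.normal(size=(d_out, d_in))          # frozen pre-trained weight
A = rng.normal(size=(rank, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, rank))                 # trainable up-projection, init 0
alpha = 16.0                                # scaling hyperparameter

def lora_forward(x):
    # Output = frozen path + scaled low-rank adapter path
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B = 0 the adapter is an exact no-op at the start of fine-tuning
print(np.allclose(lora_forward(x), W @ x))  # True
```

Only `A` and `B` are trained, so the trainable parameter count scales with the rank rather than with the full weight matrix.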

Loss Function

A mathematical function that quantifies the discrepancy between model predictions and ground truth targets. The loss is minimized during training. Common losses in robot learning: MSE (continuous actions), cross-entropy (discrete actions), diffusion loss (denoising score matching), and contrastive loss (representation learning). The choice of loss function directly shapes what the model learns.

MLTraining

LSTM

Long Short-Term Memory — a recurrent neural network architecture with memory cells and gating mechanisms (input, forget, output gates) that can learn long-term temporal dependencies. LSTMs are used in robot policies that need to integrate information over time, such as tasks requiring memory of past observations or multi-step reasoning. They have been largely superseded by transformers for sequence modeling.

MLArchitecture

LEAP Hand

A low-cost, open-source dexterous robot hand designed at CMU for manipulation research. LEAP Hand has 16 DOF across four fingers, using off-the-shelf servo motors. Its open design and low cost (~$2000) make dexterous manipulation research accessible to more labs. RL policies trained in simulation have been successfully transferred to the physical LEAP Hand.

HardwareDexterous

LIBERO

A benchmark suite for lifelong learning in robotic manipulation, providing 130 procedurally generated tasks grouped into four task suites of increasing complexity. LIBERO evaluates a policy's ability to learn new tasks without forgetting previously learned ones (continual learning). It uses the robosuite simulation framework with a Franka Panda robot.

BenchmarkRobot Learning

Logistics Robot

A robot operating in warehouse, distribution center, or manufacturing environments to move goods, manage inventory, and optimize material flow. Categories include: goods-to-person AMRs (pick station replenishment), sortation robots, autonomous forklifts, palletizing robots, and case-picking manipulators. The logistics robot market is one of the largest segments in commercial robotics.

ApplicationsMobile RoboticsIndustrial

LiDAR Odometry

Estimating robot ego-motion by registering consecutive LiDAR scans using algorithms such as ICP, NDT (Normal Distributions Transform), or feature-based pipelines (LOAM, LIO-SAM). LiDAR odometry provides accurate, low-drift motion estimates in structured environments but can struggle in geometrically degenerate settings (long corridors, open spaces).

VisionNavigationSensors

Loco-Manipulation

The combined task of locomotion and manipulation, where a mobile robot moves through an environment and interacts with objects. Loco-manipulation requires whole-body coordination between the mobile base and the arm(s). It is central to humanoid and mobile manipulator research: navigating to an object, then grasping it while maintaining balance.

LocomotionManipulation

Lazy Grasping

A strategy that exploits environmental constraints (walls, floors, edges) as passive grasp supports, reducing the precision requirements on the robot's active grasp. By backing an object into a corner before grasping, or using gravity to slide it into a consistent pose, lazy grasping achieves robust picking with simple grippers.

ManipulationGrasping

Lead Screw

A screw mechanism that converts rotary motion to linear motion, widely used in linear actuators and Cartesian robot axes. Lead screw efficiency depends on the lead angle: a low lead angle (fine pitch) is self-locking (cannot be back-driven) but less efficient; a high lead angle is back-drivable but may not hold position without brakes. Ball screws use recirculating ball bearings for higher efficiency (90%+ vs. roughly 40% for sliding screws).

HardwareActuation

Latency Compensation

Techniques that account for the time delay between sensing and actuation in a robot control loop. Latency compensation predicts the future state at the time the control action will take effect, enabling stable control despite delays. It is critical in teleoperation (network latency) and visual-servo control (camera processing delay).

ControlSoftwareTeleoperation

Lie Group

A group that is also a smooth manifold, where the group operations are smooth functions. The Special Euclidean group SE(3) (rigid body transforms) and Special Orthogonal group SO(3) (rotations) are the fundamental Lie groups in robotics. Lie group theory provides the mathematical foundation for SLAM, trajectory optimization, and robot dynamics on manifolds.

MathKinematics
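
As a concrete example, the exponential map from the Lie algebra so(3) to the group SO(3) is given by Rodrigues' formula. A minimal NumPy sketch:

```python
import numpy as np

def so3_exp(omega):
    """Rodrigues' formula: map a rotation vector (axis * angle) in the
    Lie algebra so(3) to a rotation matrix in SO(3)."""
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)
    k = omega / theta
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])          # skew-symmetric matrix of the axis
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

R = so3_exp(np.array([0.0, 0.0, np.pi / 2]))  # 90 degrees about z
print(np.allclose(R @ R.T, np.eye(3)))        # True: orthogonal
print(np.isclose(np.linalg.det(R), 1.0))      # True: determinant +1
```

The corresponding logarithm map goes the other way, which is what lets SLAM and trajectory optimizers treat rotations as local Euclidean increments.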

Last Mile Delivery

The final stage of package delivery from a local distribution hub to the customer's door. Last-mile delivery robots (Starship, Nuro, Serve) navigate sidewalks or roads autonomously. They are most cost-effective for dense urban deployments. Key challenges include pedestrian navigation, bad weather, and secure package handover.

ApplicationsMobile RoboticsNavigation

Latent Action Representation

Compressing high-dimensional robot actions into a compact latent space using a VAE or VQ-VAE, then training policies to predict latent codes rather than raw actions. Latent action representations reduce the effective action dimensionality, making policy learning easier and enabling better modeling of multi-modal distributions. They are used in language-conditioned and video-prediction-based robot learning.

Robot LearningRepresentation Learning

Learning from Observation

Imitation learning from video demonstrations that show only states (observations) without action labels — the observer sees what was done but not how. LfO algorithms must infer actions from state differences using inverse dynamics models. This setting enables learning from large-scale video sources, such as internet videos, where action labels are unavailable.

Robot LearningImitation Learning

Learning from Play

Imitation learning from unstructured play data — teleoperated robot exploration where the human interacts with the environment without following a specific task. Play data is cheap to collect (no task specification needed) and diverse. Play-LMP and other LfP algorithms extract reusable skill libraries from play datasets, enabling goal-conditioned downstream task learning.

Robot LearningImitation Learning

M

Manipulation

Manipulation refers to purposeful physical interaction with objects — picking, placing, assembling, folding, inserting, pouring, and similar tasks. Robot manipulation is one of the most active research areas in embodied AI, because even simple everyday tasks (loading a dishwasher, opening a package) require rich perception, precise motor control, and robust grasp planning. Manipulation difficulty scales from simple pick-and-place with known objects in fixed setups, through contact-rich assembly, to fully dexterous in-hand reorientation with novel objects in unstructured scenes. SVRC's data services specialize in collecting manipulation demonstrations for training and evaluation.

Core ConceptTask

MoveIt

MoveIt is the most widely used open-source motion planning framework for robot arms, originally developed at Willow Garage and now maintained by PickNik Robotics. MoveIt 2 runs on ROS 2 and provides planners (OMPL, CHOMP, PILZ), Cartesian trajectory planning, collision checking against MoveIt's planning scene, kinematics plugins (KDL, IKFast, TracIK), and grasp planning integration. It is the standard middleware layer between a robot learning policy (which outputs desired end-effector poses or waypoints) and the low-level joint controller that executes smooth, collision-free trajectories on the physical robot.

ToolPlanningROS

Multi-task Learning

Multi-task learning trains a single policy on demonstrations from multiple distinct tasks simultaneously, with the expectation that shared representations learned across tasks improve performance on each individual task and enable generalization to new tasks. In robotics, this often means training on hundreds of tasks with varied objects, goals, and environments. The key challenge is balancing the gradient contributions of different tasks (gradient interference) and ensuring the policy can distinguish between tasks at inference time — typically via language conditioning or one-hot task identifiers. Multi-task policies are a prerequisite for general-purpose robotic assistants.

PolicyGeneralizationTraining

Manipulability Ellipsoid

A geometric representation of a robot's ability to move or exert forces in different Cartesian directions at a given configuration. The velocity manipulability ellipsoid is derived from the Jacobian: directions with large ellipsoid axes correspond to easy motion; small axes indicate near-singularity. Yoshikawa's manipulability measure, w = √det(JJᵀ), is a scalar summary used for configuration optimization.

Kinematics
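
A toy computation for a 2-link planar arm (link lengths illustrative) shows the measure collapsing at the stretched-out singularity:

```python
import numpy as np

def planar_2link_jacobian(t1, t2, l1=0.3, l2=0.25):
    """Position Jacobian of a 2-link planar arm's end-effector."""
    s1, c1 = np.sin(t1), np.cos(t1)
    s12, c12 = np.sin(t1 + t2), np.cos(t1 + t2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def yoshikawa(J):
    # w = sqrt(det(J Jᵀ)); abs() guards tiny negative values near singularities
    return np.sqrt(abs(np.linalg.det(J @ J.T)))

print(yoshikawa(planar_2link_jacobian(0.3, np.pi / 2)))  # elbow bent: w = l1*l2
print(yoshikawa(planar_2link_jacobian(0.3, 0.0)))        # arm stretched: w ~ 0
```

For this arm the measure works out analytically to l1·l2·|sin θ₂|, so it vanishes exactly when the arm is fully extended or folded.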

Meta-Learning

Learning to learn — training a model on a distribution of tasks so that it can rapidly adapt to new tasks with minimal data. In robot learning, meta-learning approaches like MAML, ProMP, and task-conditioned policies enable few-shot adaptation to new objects, environments, or task variations. The model learns an initialization or adaptation strategy that is broadly effective across the task distribution.

Robot Learning

Mobile Manipulation

Combining a mobile base (wheeled, legged, or tracked) with one or more manipulator arms to perform tasks that require both navigation and grasping. Mobile manipulators can reach workspaces that fixed-base arms cannot. Whole-body coordination between the base and arm(s) is a key control challenge. Examples: Mobile ALOHA, Spot with arm, Hello Robot Stretch.

ManipulationNavigation

Model Predictive Control

A control framework that solves an optimization problem at each timestep to find the control sequence that minimizes a cost function over a finite prediction horizon, subject to dynamics constraints. Only the first control action is applied; the problem is re-solved at the next timestep (receding horizon). MPC naturally handles constraints and is increasingly used for legged locomotion and manipulation.

Control

MuJoCo

Multi-Joint dynamics with Contact — a fast and accurate physics simulator originally developed by Emo Todorov, now open-sourced and maintained by Google DeepMind. MuJoCo is the standard simulator for RL-based locomotion and manipulation research due to its speed, stability, and accurate contact modeling. It supports tendons, muscles, and soft contacts. MuJoCo 3 adds GPU-accelerated batch simulation (MJX).

SimulationSoftware

Masked Autoencoder

A self-supervised pre-training approach where random patches of an image are masked, and a ViT encoder-decoder learns to reconstruct the missing patches. MAE pre-training learns strong visual representations that transfer well to downstream tasks. In robotics, MAE-pre-trained vision encoders provide robust features for manipulation policies, especially with limited labeled robot data.

MLRepresentation LearningVision

MLP

Multi-Layer Perceptron — a feedforward neural network consisting of fully connected layers with nonlinear activations. MLPs are the simplest deep learning architecture and serve as building blocks within larger models. In robot learning, MLP heads map learned representations to action outputs, and small MLPs serve as policy networks for low-dimensional state spaces.

MLArchitecture

Mixed Precision Training

Training neural networks using a mix of 16-bit (FP16 or BF16) and 32-bit (FP32) floating-point arithmetic to reduce memory usage and increase throughput while maintaining training stability. Mixed precision is standard for training large robot learning models on GPUs — it can roughly halve memory consumption and substantially increase throughput compared to full FP32 training.

MLTraining

Multi-Head Attention

An attention mechanism that runs multiple attention operations in parallel, each with different learned projections, then concatenates their outputs. Multi-head attention allows the model to attend to different types of information (position, color, shape) simultaneously. It is the core computational primitive of transformer architectures used in VLAs and policy transformers.

MLTransformer
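
A minimal single-batch NumPy sketch of the mechanism — projections, per-head scaled dot-product attention, then concatenation (dimensions are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads):
    """Single-batch multi-head self-attention (no masking)."""
    T, d = x.shape
    dh = d // n_heads
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # Split into heads: (n_heads, T, dh)
    split = lambda m: m.reshape(T, n_heads, dh).transpose(1, 0, 2)
    q, k, v = split(q), split(k), split(v)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # (heads, T, T)
    out = (attn @ v).transpose(1, 0, 2).reshape(T, d)       # concat heads
    return out @ Wo

rng = np.random.default_rng(0)
T, d, H = 5, 16, 4
x = rng.normal(size=(T, d))
Ws = [rng.normal(size=(d, d)) * 0.1 for _ in range(4)]
y = multi_head_attention(x, *Ws, n_heads=H)
print(y.shape)  # (5, 16)
```

Each head attends over the sequence with its own learned projections; the final output projection `Wo` mixes the heads back into one representation.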

Mobile ALOHA

An extension of the ALOHA bimanual teleoperation system mounted on a mobile base, enabling whole-body loco-manipulation tasks. Mobile ALOHA can perform complex household tasks: cooking, cleaning, and door opening. It has been a key platform for demonstrating co-training — training on both Mobile ALOHA and static ALOHA data improves both systems.

HardwareTeleoperationManipulation

Mining Robot

A robot designed for mining operations: drilling, blasting support, hauling, surveying, and emergency response in underground mines. Mining robots operate in GPS-denied, dusty, and structurally hazardous environments. Autonomous haul trucks (by Caterpillar, Komatsu) are the most commercially mature mining robot category, with hundreds deployed in open-pit mines.

ApplicationsMobile Robotics

Model-Based Predictive Control

Model-based control that uses a learned or analytical dynamics model to predict future system states and optimize control inputs over a receding horizon. MBPC combines the data efficiency of model-based methods with MPC's constraint handling. In robot learning, learned neural-network dynamics models enable MBPC for contact-rich manipulation where analytical models are impractical to derive.

ControlRobot Learning

Mu-Synthesis

A robust control design method that minimizes the structured singular value μ, providing robustness guarantees against structured uncertainty (parametric variations, unmodeled dynamics) while maintaining performance. D-K iteration is the standard algorithm. Mu-synthesis is used for precision robot control where component tolerances and payload variations create structured uncertainty.

Control

Monocular Depth

Estimating absolute or relative depth from a single RGB image using a neural network. Monocular depth is inherently ambiguous (scale undetermined) without additional priors, but foundation models (Depth Anything, ZoeDepth, Marigold) achieve strong — in some cases metric — depth estimation by learning from diverse training data. Used where stereo or active depth sensing is unavailable.

VisionSensors

Multi-View Geometry

The mathematical study of how 3D structure and camera motion can be inferred from multiple 2D images. Topics include epipolar geometry, triangulation, structure from motion (SfM), and projective reconstruction. Multi-view geometry is the theoretical foundation of visual SLAM, 3D reconstruction, and camera calibration.

VisionMath

Model Predictive Path Integral

A sampling-based MPC algorithm that generates many (thousands) of noisy rollouts using the system model, evaluates their costs, and combines them using an exponential importance-weighted average to compute the optimal control. MPPI is highly parallelizable on GPU, enabling real-time MPC for complex nonlinear systems. It has been applied to aggressive driving, legged locomotion, and manipulation.

ControlLocomotion
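
A minimal single-iteration sketch on a 1-D point mass (all constants illustrative) shows the sample-then-weight-average structure:

```python
import numpy as np

rng = np.random.default_rng(0)
H, K, dt, lam = 20, 512, 0.05, 1.0     # horizon, samples, timestep, temperature
goal = 1.0

def rollout_cost(u_seq):
    """Simulate a 1-D point mass and accumulate tracking + effort cost."""
    x, v, cost = 0.0, 0.0, 0.0
    for u in u_seq:
        v += u * dt
        x += v * dt
        cost += (x - goal) ** 2 + 1e-3 * u ** 2
    return cost

u_nom = np.zeros(H)                                    # nominal control sequence
noise = rng.normal(scale=2.0, size=(K, H))             # K noisy perturbations
costs = np.array([rollout_cost(u_nom + n) for n in noise])
weights = np.exp(-(costs - costs.min()) / lam)         # exponential weighting
weights /= weights.sum()
u_new = u_nom + weights @ noise                        # importance-weighted update

print(rollout_cost(u_new) < rollout_cost(u_nom))
```

In a real controller this update is re-run every timestep with the shifted previous solution as the new nominal, and the K rollouts are evaluated in parallel on GPU.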

Motion Capture

A system for recording the precise 3D motion of markers attached to a person or robot body. Optical motion capture (Vicon, OptiTrack) tracks reflective or active markers with millimeter accuracy. In robot learning, motion capture provides ground-truth pose references for kinesthetic demonstrations and evaluates tracking accuracy. It is also used to generate reference trajectories for RL.

SensorsHardwareRobot Learning

Mass Estimation

Estimating the mass and inertial properties of a grasped object in real time from joint torques, motor currents, or force-torque sensor readings. Online mass estimation enables adaptive gravity compensation and dynamic control when handling unknown objects. A common approach excites the arm through a test motion and fits an inertial model to the measured force-torque trajectory.

ManipulationSensorsControl

Multi-Finger Grasp

A grasp using three or more fingers for superior object stability, in-hand dexterity, and force distribution compared to parallel-jaw grasps. Multi-finger grasping enables force closure on irregular objects and supports in-hand manipulation. Planning multi-finger grasps requires solving a combinatorial contact selection problem, typically with sampling-based or optimization methods.

ManipulationGraspingDexterous

Motor Controller

Electronics that convert digital command signals into the currents and voltages that drive a motor. Robot motor controllers implement commutation (for BLDC motors), current (torque) control, velocity control, and position control loops. EtherCAT servo drives (Elmo, Kollmorgen) and integrated smart servos (Dynamixel, HEBI) are common in research robots.

HardwareElectronicsControl

MQTT

Message Queuing Telemetry Transport — a lightweight publish-subscribe messaging protocol designed for IoT and low-bandwidth networks. In robotics, MQTT is used for fleet management, remote monitoring, and cloud-to-robot communication where ROS's DDS overhead is impractical. Its quality-of-service levels provide configurable delivery guarantees over unreliable networks.

Software

Manifold

A topological space that locally resembles Euclidean space. The configuration space of robot joints lies on a manifold (SO(3) for spherical joints, S¹ for revolute joints). Optimization and planning on manifolds requires manifold-aware methods (Riemannian gradient descent, retraction operators) rather than standard Euclidean algorithms.

MathKinematics

Markov Decision Process

A mathematical framework for sequential decision making under uncertainty: a tuple (S, A, P, R, γ) of states, actions, transition probabilities, reward function, and discount factor. RL algorithms learn policies that maximize expected discounted return in an MDP. Robot manipulation and locomotion are modeled as MDPs, where the policy maps states to actions.

MathRL
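
A tiny worked example — value iteration on a hypothetical 2-state, 2-action MDP with made-up numbers:

```python
import numpy as np

# P[s, a, s'] = transition probability, R[s, a] = expected reward (illustrative)
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.0, 1.0], [0.7, 0.3]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9                            # discount factor

V = np.zeros(2)
for _ in range(500):
    Q = R + gamma * P @ V              # Q[s, a] = R[s, a] + γ Σ_s' P[s,a,s'] V[s']
    V_new = Q.max(axis=1)              # Bellman optimality backup
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

policy = Q.argmax(axis=1)              # greedy policy w.r.t. converged values
print(policy)  # → [0 1]
```

Value iteration is tractable for tabular MDPs like this; for continuous robot state and action spaces, RL algorithms approximate the same backup with function approximators.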

Monte Carlo Methods

A class of computational algorithms that use random sampling to obtain numerical results, typically for probability estimation or optimization. In robotics: Monte Carlo localization (particle filters) estimates robot pose; Monte Carlo tree search plans in stochastic environments; MC rollouts estimate expected returns in policy optimization.

MathRL

Manufacturing Execution System

A software system that controls, monitors, and optimizes production operations on the factory floor, bridging ERP (business planning) and actual production equipment (PLCs, robots). MES tracks work orders, material consumption, quality data, and OEE (Overall Equipment Effectiveness) in real time, enabling data-driven manufacturing improvement.

ApplicationsSoftwareIndustrial

Medical Imaging Robot

A robot that positions or moves medical imaging devices (X-ray, ultrasound, CT) with precision to acquire consistent images without requiring manual repositioning. Ultrasound robots perform tele-echography, enabling remote expert scanning. Interventional robots combine imaging and therapy: MRI-compatible needle robots for biopsy and tumor ablation.

ApplicationsMedical

Motion Primitive

A parameterized, reusable unit of robot motion: a short trajectory segment or controller that can be composed into longer behaviors. Dynamic Movement Primitives (DMPs) are a classic motion primitive formulation that encodes demonstrated trajectories in a stable attractor system. Motion primitives enable rapid adaptation to new task parameters by adjusting primitive parameters rather than relearning from scratch.

Robot LearningControlPlanning

N

Neural Policy

A neural policy is a robot control policy parameterized by a neural network that maps observations (images, proprioception, language) directly to actions (joint positions, Cartesian deltas, gripper commands). In contrast to classical motion planning pipelines, neural policies learn the mapping end-to-end from data without hand-engineered intermediate representations. Modern neural policies use convolutional encoders for vision, transformers for sequence modeling, and architectures like ACT, Diffusion Policy, or VLA backbones for action generation. A key property of neural policies is that they can be trained from demonstrations or reward signals, enabling them to handle tasks too complex for hand-coded controllers.

PolicyDeep Learning

Non-prehensile Manipulation

Non-prehensile manipulation refers to manipulating objects without grasping them — instead using pushing, rolling, pivoting, flipping, tilting, or other contact strategies that leverage gravity and surface friction. For example, pushing a box across a table to position it, or nudging a peg upright before grasping it. Non-prehensile strategies can move objects into graspable configurations, reposition items too large to grasp, or work in cluttered scenes where a grasp approach is infeasible. Planning non-prehensile actions requires modeling quasi-static or dynamic object mechanics and contact physics, making it an active research topic at the intersection of manipulation and motion planning.

ManipulationPlanning

Nav2

The ROS 2 Navigation Stack — a collection of packages providing autonomous mobile robot navigation: localization (AMCL), costmap generation, global planning (NavFn, Smac), local planning (DWB, MPPI), behavior trees for recovery, and lifecycle management. Nav2 is the standard navigation framework for ROS 2-based mobile robots.

SoftwareNavigation

Null-Space Control

For redundant robots (more DOF than task dimensions), null-space control exploits the extra degrees of freedom to achieve secondary objectives (obstacle avoidance, joint limit avoidance, singularity avoidance) without affecting the primary Cartesian task. The null-space projector maps secondary torques into the subspace of joint motions that produce zero end-effector motion.

ControlKinematics
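
The projector itself is one line of linear algebra. A NumPy sketch with an arbitrary 7-DOF Jacobian (values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.normal(size=(3, 7))            # task Jacobian: 3-D task, 7-DOF arm

J_pinv = np.linalg.pinv(J)
N = np.eye(7) - J_pinv @ J             # null-space projector

q_dot_task = J_pinv @ np.array([0.1, 0.0, -0.05])  # primary Cartesian velocity
q_dot_secondary = rng.normal(size=7)               # e.g. drift toward joint centers
q_dot = q_dot_task + N @ q_dot_secondary           # combined joint command

# The projected secondary motion produces zero end-effector velocity
print(np.allclose(J @ (N @ q_dot_secondary), 0.0, atol=1e-10))  # True
```

Because J J⁺ J = J, anything passed through N is invisible to the primary task, which is exactly the property the text describes.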

Normalizing Flow

A generative model that transforms a simple base distribution into a complex target distribution through a sequence of invertible transformations. Normalizing flows enable exact density evaluation and efficient sampling. In robot learning, they are used for modeling multi-modal action distributions as an alternative to diffusion models, with the advantage of single-pass generation.

MLGenerative
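
A minimal 1-D illustration of the change-of-variables rule behind exact density evaluation (a single affine transform; constants illustrative):

```python
import numpy as np

# Flow: base z ~ N(0, 1), x = a*z + b (one invertible affine layer).
# Change of variables: log p(x) = log p_base((x - b) / a) - log|a|.
a, b = 2.0, 1.0

def log_prob(x):
    z = (x - b) / a
    return -0.5 * (z**2 + np.log(2 * np.pi)) - np.log(abs(a))

# Sanity check: the transformed density still integrates to 1
xs = np.linspace(-20.0, 22.0, 400_001)
total = np.sum(np.exp(log_prob(xs))) * (xs[1] - xs[0])
print(np.isclose(total, 1.0, atol=1e-4))  # True
```

Real flows stack many such invertible layers (with learnable parameters) so the composed Jacobian log-determinant stays cheap to evaluate.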

Nonlinear MPC

Model Predictive Control applied to nonlinear system models, solving a nonlinear program (NLP) at each timestep. NMPC provides optimal control with nonlinear constraint handling but requires real-time NLP solvers. It is used for quadrotor trajectory optimization, legged robot locomotion, and dexterous manipulation where linear approximations are insufficient.

Control

NeRF

Neural Radiance Field — a neural network that represents a 3D scene as a continuous volumetric function mapping (x,y,z, viewing direction) to (color, density). NeRF renders photorealistic novel views by ray marching through the volume. In robotics, NeRF is used for digital twin construction, novel view synthesis for data augmentation, and scene representation for manipulation.

Vision3DRobot Learning

Null-Space Grasping

Exploiting the null space of a redundant robot arm to position the wrist and approach the object from a desired direction, while maintaining the end-effector at the target grasp pose. Null-space grasping uses the extra DOF to optimize approach direction, avoid singularities, and clear obstacles during the approach motion.

ManipulationKinematics

Nuclear Robot

A robot designed for operation in radioactive environments, performing inspection, maintenance, decommissioning, and emergency response at nuclear facilities. Nuclear robots must withstand radiation-induced electronics failures and contamination. Remote teleoperation over radiation-hardened links and radiation-tolerant electronics are essential design requirements.

ApplicationsIndustrialSafety

Neural Process

A meta-learning model that learns a distribution over functions conditioned on context points (observed input-output pairs), enabling fast adaptation to new tasks. Neural Processes combine the representational power of neural networks with the flexibility of Gaussian Processes. In robotics, NPs enable few-shot adaptation of manipulation policies from a few new demonstrations.

Robot LearningML

O

Observation Space

The observation space defines all sensor inputs available to the robot policy at each timestep. Common modalities include RGB images from wrist or overhead cameras, depth maps from structured-light or stereo sensors, proprioceptive state (joint positions, velocities, torques), gripper state, end-effector pose, tactile readings, and task-specification inputs such as language embeddings or goal images. The observation space design profoundly affects policy performance and generalization: richer observations carry more information but increase model complexity, training time, and the risk of overfitting to irrelevant visual features.

PerceptionPolicy

Open-loop Control

Open-loop control executes a pre-planned trajectory without using sensor feedback during execution — the robot simply follows the commanded positions or velocities regardless of what actually happens. This is appropriate for highly repeatable tasks in controlled environments, such as CNC machining or pick-and-place on a fixed conveyor. Open-loop control is fast and simple but fails when disturbances occur, because no corrective action is taken. In contrast, closed-loop (feedback) control continuously compares the actual state to the desired state and applies corrective commands, making it far more robust for robot learning in variable environments.

Control

Open X-Embodiment

Open X-Embodiment (OXE) is a large-scale robot demonstration dataset assembled by Google DeepMind together with 21 partner research institutions, comprising over 1 million robot episodes from 22 different robot embodiments covering 527 skills. It was created to enable co-training across embodiments — the hypothesis being that diverse robot experience teaches richer manipulation representations than single-robot datasets alone. RT-X, the model trained on OXE, demonstrated positive transfer across embodiments and improved performance on held-out tasks compared to single-embodiment baselines. OXE data is publicly available and has catalyzed a wave of cross-embodiment robotics research.

DatasetFoundation ModelMulti-embodiment

Occupancy Grid

A 2D or 3D grid where each cell stores the probability that it is occupied by an obstacle. Occupancy grids are the standard map representation for mobile robot navigation. They are updated incrementally from LiDAR, depth camera, or sonar measurements using Bayesian updates. Resolution (cell size) trades off map fidelity against memory and computation.

NavigationSLAM
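
The standard per-cell Bayesian update is done in log-odds form. A minimal sketch with an illustrative inverse sensor model:

```python
import numpy as np

# Inverse sensor model (illustrative): p(occ | hit) = 0.7, p(occ | miss) = 0.3
L_OCC, L_FREE = np.log(0.7 / 0.3), np.log(0.3 / 0.7)

def update(log_odds, hit):
    """Add the measurement's log-odds; repeated evidence accumulates."""
    return log_odds + (L_OCC if hit else L_FREE)

def prob(log_odds):
    """Convert log-odds back to occupancy probability."""
    return 1.0 - 1.0 / (1.0 + np.exp(log_odds))

l = 0.0                       # prior log-odds 0, i.e. p = 0.5 (unknown)
for _ in range(3):
    l = update(l, hit=True)   # three consecutive "occupied" measurements
print(prob(l) > 0.9)          # True: cell is now confidently occupied
```

Working in log-odds turns the Bayesian product of evidence into cheap additions, which is why mapping stacks store grids this way.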

Odometry

Estimating the robot's incremental motion (change in position and orientation) from wheel encoders, IMU, visual features, or LiDAR scan matching. Odometry is the primary input to dead reckoning and SLAM. Wheel odometry drifts due to slip; visual odometry drifts due to tracking errors; combining multiple odometry sources via sensor fusion reduces overall drift.

NavigationSensors
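
A minimal wheel-odometry sketch for a differential-drive robot, using midpoint integration of incremental wheel travel (the wheelbase value is illustrative):

```python
import numpy as np

def diff_drive_update(x, y, theta, d_left, d_right, wheelbase=0.35):
    """Dead-reckoning pose update from incremental wheel travel (meters)."""
    d = 0.5 * (d_left + d_right)             # forward distance of the base
    d_theta = (d_right - d_left) / wheelbase # heading change
    x += d * np.cos(theta + 0.5 * d_theta)   # midpoint integration
    y += d * np.sin(theta + 0.5 * d_theta)
    return x, y, theta + d_theta

pose = (0.0, 0.0, 0.0)
for _ in range(100):                         # equal wheel travel: straight line
    pose = diff_drive_update(*pose, 0.01, 0.01)
print(np.isclose(pose[0], 1.0))              # True: drove 1 m along x
```

Any error in the wheelbase or wheel radius biases every update, which is why wheel odometry drifts and is fused with IMU or visual estimates in practice.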

Offline RL

Reinforcement learning from a fixed dataset of previously collected transitions, without any additional online interaction with the environment. Offline RL algorithms (CQL, IQL, TD3+BC) address the distribution shift between the behavior policy that collected the data and the learned policy. Offline RL is attractive for robotics because it avoids the safety concerns and cost of online exploration.

Robot LearningRL

OMPL

The Open Motion Planning Library — a C++ library of sampling-based motion planning algorithms (RRT, RRT*, PRM, EST, KPIECE, and many more). OMPL is integrated into MoveIt and provides a common interface for different planners. It handles arbitrary-dimensional configuration spaces, making it suitable for high-DOF robot arms and mobile manipulators.

SoftwarePlanning

Online RL

Reinforcement learning where the agent actively interacts with the environment, collecting new transitions and updating its policy in real time. Online RL can achieve higher performance than offline RL by exploring regions not covered in static datasets, but requires safe exploration mechanisms in physical robot settings. Sim-to-real transfer is often used to safely conduct the online RL phase in simulation.

Robot LearningRL

Open-Vocabulary Manipulation

The ability of a robot to manipulate objects specified by free-form language descriptions, even objects not seen during training. Open-vocabulary manipulation combines VLMs (for identifying objects from language queries) with manipulation policies. Models like CLIPort, VIMA, and RT-2 demonstrate open-vocabulary capabilities by grounding language in visual observations.

Robot LearningVision-LanguageManipulation

Open3D

An open-source library for 3D data processing: point cloud manipulation, mesh processing, 3D visualization, and 3D reconstruction. In robotics, Open3D is used for processing depth camera outputs, building 3D maps, and implementing perception pipelines for grasping and navigation. It supports both CPU and GPU processing.

SoftwareVision

OpenCV

The most widely used open-source computer vision library, providing functions for image processing, feature detection, camera calibration, object detection, and video analysis. In robotics, OpenCV is the foundation of most visual perception pipelines. Key robotics functions include ArUco marker detection, camera intrinsic/extrinsic calibration, and stereo matching.

SoftwareVision

Operational Space Control

A control framework introduced by Oussama Khatib that formulates robot dynamics and control directly in task (Cartesian) space rather than joint space. It enables intuitive specification of desired end-effector forces and impedances. The approach uses the dynamically consistent generalized inverse of the Jacobian and is the theoretical foundation for Cartesian impedance control.

ControlManipulation

Object Detection

Identifying and localizing objects in images by predicting bounding boxes and class labels. Two-stage detectors (Faster R-CNN) propose regions then classify; single-stage detectors (YOLO, SSD) predict directly. In robotics, object detection identifies graspable objects, obstacles, and task-relevant entities. Real-time detectors (YOLOv8) run at 30+ FPS on edge hardware.

MLVision

Overfitting

When a model performs well on training data but poorly on unseen data, having memorized training-specific patterns rather than learning generalizable features. Overfitting is a critical concern in robot learning where datasets are small. Countermeasures include data augmentation, dropout, weight decay, early stopping, and using pre-trained encoders.

MLTraining

OpenArm Platform

An open-source robot arm platform by Silicon Valley Robotics Center designed for accessible robot learning research. OpenArm provides affordable hardware with integrated data collection capabilities through the SVRC data platform. It is part of the SVRC ecosystem for teleoperation demonstration collection and imitation learning policy training.

HardwareManipulation

Open X-Embodiment

A collaborative dataset aggregating robot manipulation demonstrations from 21 institutions and 22 robot embodiments, totaling over 1 million episodes. Created to enable co-training of generalist manipulation policies across diverse embodiments. The dataset powers the RT-X and Octo models and is among the largest open robot learning datasets available.

DatasetRobot Learning

OpenVLA

An open-source Vision-Language-Action model built on the Prismatic VLM architecture, fine-tuned on the Open X-Embodiment dataset. OpenVLA predicts discretized robot actions from images and language instructions. As an open-weights model (7B parameters), it enables the research community to study, modify, and build upon VLA technology without proprietary restrictions.

VLARobot Learning

Octo

An open-source generalist robot policy trained on 800K episodes from the Open X-Embodiment dataset. Octo uses a transformer architecture that takes images and language instructions as input and outputs actions for diverse robot embodiments. It can be fine-tuned to new robots and tasks with small amounts of target data, serving as a foundation model for manipulation.

VLARobot Learning

Output Feedback Control

Control based on measured outputs (rather than full state), requiring a state observer or output feedback design. Most real robots are output-feedback systems — joint positions are measured but velocities must be estimated or differentiated. Output feedback control must handle observer dynamics and potential instability from estimated states.

Control

Object Pose Tracking

Continuously estimating the 6-DOF pose of a moving or deforming object across video frames. Unlike one-shot pose estimation, tracking uses temporal continuity and the previous frame's estimate to initialize each new estimate. Methods include PoseCNN+DeepIM, FoundationPose, and particle filter-based approaches. Critical for dynamic grasping and in-hand manipulation feedback.

VisionManipulation

Optical Flow

The apparent motion of pixel intensities between consecutive video frames, caused by relative motion between the camera and scene. Sparse optical flow (Lucas-Kanade) tracks a set of feature points; dense optical flow (RAFT) estimates per-pixel velocities. In robotics, optical flow is used for ego-motion estimation, object tracking, and dynamic obstacle detection.

Vision

ORB Features

Oriented FAST and Rotated BRIEF — a fast, patent-free feature detector and descriptor combining the FAST keypoint detector with the BRIEF descriptor, augmented with orientation estimation for rotation invariance. ORB features are used in ORB-SLAM and other real-time visual SLAM systems because they run on CPU at 30+ FPS.

VisionSLAM

Open-Loop Grasping

Executing a grasp based solely on an initial pose estimate without visual or force feedback during execution. Open-loop grasping is fast but fails when the object moves after the initial estimate or when positioning accuracy is insufficient. It is the baseline approach for fast industrial pick-and-place where objects are tightly constrained.

ManipulationGrasping

Optimization Landscape

The shape of the objective function surface over the parameter space. Understanding the optimization landscape — its local minima, saddle points, and curvature — is important for robot learning. Deep neural networks have complex, high-dimensional optimization landscapes; initialization, architecture, and optimizer choice affect which region of the landscape is explored.

MathML

OEE

Overall Equipment Effectiveness — a manufacturing KPI measuring how effectively a production resource is used: OEE = Availability × Performance × Quality. For robot cells, downtime (unplanned stops), speed losses (running below rated speed), and quality losses (defective parts) all reduce OEE. A world-class OEE is 85%; many manual operations achieve only 40-60%.
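
The formula is a straight product of three ratios; a minimal sketch (the percentages below are invented for illustration):

```python
def oee(availability, performance, quality):
    """Overall Equipment Effectiveness: the product of three ratios in [0, 1]."""
    return availability * performance * quality

# Illustrative numbers: 90% uptime, 95% of rated speed, 99% good parts
print(f"OEE = {oee(0.90, 0.95, 0.99):.1%}")  # OEE = 84.6%
```

Note how the multiplicative form punishes weakness in any one factor: three individually respectable ratios still land well below the 85% world-class benchmark.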

ApplicationsIndustrial

Observation History

Including a window of past observations (rather than only the current frame) in the policy input, giving the policy implicit memory for partially observable tasks. N-step observation stacks are simple and effective for tasks requiring velocity estimation from position-only observations, handling brief occlusions, and detecting temporal patterns.
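
A minimal frame-stacking sketch (class and padding behavior are illustrative, not from any particular library):

```python
from collections import deque

class ObservationStack:
    """Keep the last n observations as the policy input (simple frame stack)."""
    def __init__(self, n):
        self.buf = deque(maxlen=n)

    def push(self, obs):
        if not self.buf:                 # pad with the first frame at episode reset
            self.buf.extend([obs] * self.buf.maxlen)
        self.buf.append(obs)
        return list(self.buf)            # oldest-to-newest window for the policy

stack = ObservationStack(n=3)
stack.push([0.0])          # returns [[0.0], [0.0], [0.0]]
stack.push([0.1])          # returns [[0.0], [0.0], [0.1]]
```

With position-only observations, the policy can recover velocity from the finite differences implicit in such a window.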

Robot LearningPolicy

Occupancy Map Learning

Learning to predict 3D occupancy maps from sensor observations, either to reconstruct the current scene or predict future scene states. Occupancy map predictions from a single camera frame (learned from paired depth data) enable 3D-aware manipulation without depth cameras. Differentiable occupancy map renderers enable gradient-based optimization.

Robot LearningVision3D

Offline-to-Online RL

A training paradigm that initializes a policy offline from a static dataset, then fine-tunes it online with additional environment interaction. The offline phase provides a good initialization that avoids unsafe early exploration; the online phase improves beyond the offline data's performance. IQL and CQL are common choices for the offline phase in robot manipulation.

Robot LearningRL

Online Fine-Tuning

Updating a pre-trained policy in real time using data from ongoing robot deployments. Online fine-tuning enables continuous improvement as the robot encounters new scenarios. Challenges include catastrophic forgetting, safe exploration during fine-tuning, and efficient computation of gradient updates on deployment hardware.

Robot LearningDeployment

P

Payload

Payload is the maximum mass (including the weight of any end-effector and tooling) that a robot arm can carry while maintaining its rated positional accuracy and dynamic performance. Payload specifications typically range from under 1 kg for collaborative research robots (WidowX 250: 250 g) to 500+ kg for large industrial arms. Critically, rated payload is usually quoted with the arm at full reach; at closer range and in more favorable postures, robots can often handle significantly more. Exceeding payload limits degrades accuracy, accelerates wear, and can trigger safety faults or physical damage. SVRC's hardware catalog lists payload for each robot.

HardwareSpecs

Policy (robot)

In robot learning, a policy (denoted π) is a function that maps observations to actions: π(o) → a. The policy is the learned "brain" of the robot that determines what to do at every timestep given what it perceives. Policies can be represented as neural networks (neural policies), decision trees, Gaussian processes, or lookup tables. They can be deterministic (one action per observation) or stochastic (a distribution over actions). Policy quality is measured by task success rate across diverse conditions, not just on training demonstrations. The core challenge of robot learning is training policies that generalize reliably beyond their training distribution.

Core ConceptDeep Learning

Policy Rollout

A policy rollout is a single episode of executing a trained policy on the robot (or in simulation) from an initial state to task completion or timeout. Rollouts are used to evaluate policy performance, collect new data for further training (as in DAgger or RL fine-tuning), and debug failure modes. The number of rollouts needed for reliable performance estimation depends on task variability — high-variance tasks may require 50+ rollouts to get a stable success rate estimate. In research, rollouts are often categorized by initial condition (in-distribution vs. out-of-distribution objects/scenes) to characterize generalization.

EvaluationPolicy

Pre-training

Pre-training is the phase of model development in which a neural network is trained on a large, diverse dataset before task-specific fine-tuning. For robotics foundation models, pre-training may occur on internet-scale vision-language data (images, video, text), cross-embodiment robot datasets (Open X-Embodiment), synthetic simulation data, or a combination; visual encoders (ResNet, ViT, DINOv2) are likewise pre-trained on ImageNet or internet images before robot-specific fine-tuning. The pre-trained model learns rich general representations of objects, actions, and concepts that transfer to downstream robot tasks with far fewer demonstrations than training from scratch. Pre-training is the mechanism behind the success of VLA models such as RT-2, which benefits from both robotic and internet-scale pre-training.

Foundation ModelTrainingTransfer Learning

Parallel Simulation

Running thousands of simulation instances simultaneously on GPU hardware to generate training data at massive scale. NVIDIA Isaac Gym, MuJoCo MJX, and Brax enable training RL policies on billions of timesteps in hours rather than days. Parallel simulation is the key enabler of large-scale sim-to-real transfer for locomotion and dexterous manipulation.

SimulationRL

Parallel-Jaw Gripper

The most common industrial gripper type, consisting of two flat or shaped fingers that open and close in parallel. Parallel-jaw grippers are simple, reliable, and sufficient for pick-and-place of many object types. They grasp objects by squeezing from two sides (antipodal grasp). Actuation is typically pneumatic (fast, binary) or electric (proportional force control).

HardwareGrasping

Path Planning

Computing a collision-free route from a start configuration to a goal configuration. Path planning algorithms include: grid-based (A*, Dijkstra), sampling-based (RRT, PRM), and optimization-based (trajectory optimization). In manipulation, path planning operates in high-dimensional joint space; in navigation, it operates in 2D or 3D workspace. The distinction between path planning and motion planning is that the latter also considers time and dynamics.

NavigationPlanning

Peg-in-Hole

The canonical contact-rich assembly task where a cylindrical peg must be inserted into a hole with tight clearance (often <1mm). Successful insertion requires search strategies (spiral, compliance-based), force sensing, and appropriate compliance control. Peg-in-hole is a standard benchmark for force control, impedance control, and RL-based assembly policies.

ManipulationIndustrialBenchmark

Photorealistic Rendering

Generating synthetic images that are visually indistinguishable from real photographs, using ray tracing, physically-based materials, and accurate lighting models. In robotics, photorealistic rendering in simulation reduces the visual domain gap in sim-to-real transfer. NVIDIA Isaac Sim and Blender are the primary tools. The trade-off is computational cost vs. rendering quality.

SimulationVision

Physics Engine

The core numerical engine that computes rigid-body dynamics, collision detection, and contact resolution in a simulator. Different physics engines make different trade-offs between speed, accuracy, and stability. MuJoCo uses a convex solver; PhysX uses a PGS solver; Bullet supports both. The choice of physics engine significantly impacts the fidelity of sim-to-real transfer.

Simulation

Pick and Place

The fundamental manipulation primitive: grasp an object at a source pose and place it at a target pose. Despite its simplicity, robust pick-and-place in unstructured environments requires reliable perception, grasp planning, motion planning, and error detection. Pick-and-place policies must generalize across object shapes, sizes, and materials.

Manipulation

PID Controller

A feedback controller that computes a control signal as the sum of three terms: Proportional (error), Integral (accumulated error), and Derivative (rate of error change). PID is the most widely deployed controller in industrial robotics for joint-level position and velocity regulation. Tuning the three gains (Kp, Ki, Kd) trades off response speed, overshoot, and steady-state accuracy.
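
The three-term sum can be sketched in a few lines; the gains, timestep, and toy integrator plant below are illustrative only:

```python
class PID:
    """Minimal PID controller: u = Kp*e + Ki*integral(e) + Kd*de/dt."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                  # accumulated error
        derivative = (error - self.prev_error) / self.dt  # rate of error change
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Regulate a toy velocity-controlled joint toward a 1.0 rad setpoint
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.01)
x = 0.0
for _ in range(2000):
    x += pid.update(1.0, x) * 0.01   # plant: position integrates commanded velocity
```

Raising Kp speeds up the response but invites overshoot; Ki removes steady-state error at the cost of possible windup; Kd damps oscillation but amplifies measurement noise.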

Control

Place Recognition

Identifying a previously visited location from current sensor data, without relying on odometry. Visual place recognition uses image retrieval (NetVLAD, VLAD, DBoW2) or learned embeddings; LiDAR-based place recognition uses scan descriptors (Scan Context, OverlapNet). Place recognition is the front end of loop closure in SLAM and the basis of re-localization after tracking loss.

NavigationSLAMVision

Pneumatic Actuator

An actuator powered by compressed air. Pneumatic cylinders and grippers are common in industrial automation for their speed, simplicity, and intrinsic compliance. Soft pneumatic actuators (bellows, McKibben muscles) are central to soft robotics research. Disadvantages include the need for an air compressor, limited position control, and noisy operation.

HardwareActuation

Point Cloud Library

An open-source C++ library for 2D/3D point cloud processing: filtering, segmentation, registration, surface reconstruction, feature extraction, and object recognition. PCL is the standard library for processing LiDAR and depth camera data in robotics. Key algorithms include ICP registration, RANSAC plane segmentation, and Euclidean cluster extraction.

SoftwareVision

Policy Gradient

A family of RL algorithms that directly optimize the policy parameters by estimating the gradient of the expected return with respect to policy parameters. REINFORCE, PPO, TRPO, and SAC all belong to this family. Policy gradient methods work with continuous action spaces (unlike value-based methods like DQN) and are the standard RL approach for robot control tasks.
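
As a toy illustration of the score-function idea (a softmax policy on a two-armed bandit; all rewards and hyperparameters invented), REINFORCE nudges the logits along reward-weighted gradients of log-probability:

```python
import math
import random

random.seed(0)

theta = [0.0, 0.0]        # policy logits; arm 1 pays 1.0, arm 0 pays 0.2
rewards = [0.2, 1.0]

def policy():
    z = [math.exp(t) for t in theta]
    return [v / sum(z) for v in z]

lr = 0.1
for _ in range(2000):
    p = policy()
    a = 0 if random.random() < p[0] else 1
    # REINFORCE update: theta += lr * r * grad log pi(a)
    # For a softmax policy, grad log pi(a) = one_hot(a) - p
    for i in range(2):
        theta[i] += lr * rewards[a] * ((1.0 if i == a else 0.0) - p[i])
```

After training, the probability mass concentrates on the better arm; adding a baseline (subtracting mean reward) reduces the variance of the same estimator.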

Robot LearningRL

Pouring

A manipulation skill requiring the robot to tilt a container to transfer its contents (liquid, granules) into a target vessel. Pouring is challenging because it involves fluid dynamics, precise tilt angle control, and visual feedback (monitoring fill level). It is a common benchmark for everyday manipulation and requires different strategies for different viscosities and container shapes.

Manipulation

PPO

Proximal Policy Optimization — a policy gradient RL algorithm that constrains policy updates to a trust region using a clipped surrogate objective. PPO is the default RL algorithm for robot locomotion (legged robots, humanoids) and sim-to-real transfer thanks to its stability and simplicity. As an on-policy method it is not especially sample-efficient, but massively parallel simulation compensates; it achieves trust-region behavior without the computational cost of TRPO's constrained optimization.

Robot LearningRL

Precision

The closeness of agreement between repeated measurements or movements when approaching the same target under the same conditions. Also called repeatability. Industrial robots achieve repeatability of ±0.02–0.1mm. Precision is different from accuracy — a robot can be precise (consistent) but inaccurate (consistently offset from the true target).

HardwareIndustrial

Preference Learning

Learning from human comparative judgments (e.g., 'trajectory A is better than trajectory B') rather than explicit reward signals or demonstrations. A reward model is trained to be consistent with human preferences, then used to optimize the policy via RL. This approach (RLHF applied to robotics) avoids the need for precise scalar reward engineering and can capture nuanced human intent.

Robot LearningRL

Privileged Information

State information available during training (in simulation) but not at deployment time. Examples: exact object poses, contact forces, friction coefficients. Teacher policies trained with privileged information achieve high performance; they are then distilled into student policies that use only deployment-available inputs (images, proprioception). This asymmetric training paradigm is standard in sim-to-real.

Robot LearningSim-to-Real

PRM

Probabilistic Roadmap — a sampling-based motion planning algorithm that pre-computes a graph of collision-free random configurations connected by local paths, then searches the graph for a path from start to goal. PRM is suited to multi-query planning, where many different start-goal pairs are queried on the same map. It is probabilistically complete: given enough samples, it finds a path whenever one exists.
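
The two phases (build the roadmap, then query it) can be sketched in a toy 2-D world standing in for configuration space; node count, connection radius, and the obstacle are all illustrative:

```python
import math
import random
from collections import deque

random.seed(2)

# Toy 2-D "configuration space": unit square with one circular obstacle
obstacle, radius = (0.5, 0.5), 0.2

def collision_free(p):
    return math.dist(p, obstacle) > radius

def edge_free(p, q, steps=20):
    # Check evenly spaced points along the straight-line local path
    return all(
        collision_free((p[0] + (q[0] - p[0]) * i / steps,
                        p[1] + (q[1] - p[1]) * i / steps))
        for i in range(steps + 1)
    )

# 1) Sample collision-free configurations (start and goal become nodes 0 and 1)
nodes = [(0.1, 0.1), (0.9, 0.9)]
while len(nodes) < 100:
    p = (random.random(), random.random())
    if collision_free(p):
        nodes.append(p)

# 2) Connect nearby nodes whose straight-line local paths are collision-free
edges = {i: [] for i in range(len(nodes))}
for i in range(len(nodes)):
    for j in range(i + 1, len(nodes)):
        if math.dist(nodes[i], nodes[j]) < 0.3 and edge_free(nodes[i], nodes[j]):
            edges[i].append(j)
            edges[j].append(i)

# 3) Query phase: search the roadmap (BFS here; A* or Dijkstra in practice)
frontier, seen = deque([0]), {0}
found = False
while frontier:
    u = frontier.popleft()
    if u == 1:
        found = True
        break
    for v in edges[u]:
        if v not in seen:
            seen.add(v)
            frontier.append(v)
```

Because the roadmap is built once, subsequent start-goal queries on the same map reuse steps 1 and 2 and pay only for the graph search.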

NavigationPlanning

Procedural Generation

Algorithmically creating diverse simulation environments, objects, and scenarios at training time rather than manually designing each one. Procedural generation is used to create large-scale training datasets with varied layouts, object geometries, textures, and task configurations. It is a form of domain randomization applied to scene structure rather than physics parameters.

SimulationRobot Learning

Proximity Sensor

A sensor that detects the presence of nearby objects without physical contact. Technologies include infrared (IR), ultrasonic, capacitive, and inductive types. In robotics, proximity sensors are used for pre-grasp object detection, collision avoidance, and bin-level sensing. They are simpler and cheaper than cameras but provide binary or scalar distance rather than rich spatial data.

SensorsHardware

Push Manipulation

Moving objects by pushing rather than grasping — useful when objects are too large, flat, or heavy to grasp. Push manipulation is inherently non-prehensile and requires reasoning about friction, contact mechanics, and object dynamics. Analytical models (quasi-static pushing) and learning-based approaches (push-to-goal RL) are both active research areas.

ManipulationNon-Prehensile

Panoptic Segmentation

A vision task that unifies semantic segmentation (labeling every pixel with a class) and instance segmentation (distinguishing individual object instances). Panoptic segmentation provides a complete scene understanding: every pixel has both a class label and an instance ID. In robotics, it enables precise object manipulation in cluttered scenes with multiple instances of the same class.

MLVision

Point Cloud Processing

Computing on unordered 3D point sets captured by depth cameras or LiDAR. Architectures include PointNet (per-point MLP + max pooling), PointNet++ (hierarchical), and point cloud transformers. In manipulation, point cloud processing enables 6-DOF grasp prediction, shape estimation, and 3D-aware policies that are robust to viewpoint changes.

MLVision3D

Positional Encoding

A mechanism that injects information about token position into transformer inputs, since self-attention is permutation-invariant and cannot distinguish token order. Sinusoidal and learned absolute encodings are common; rotary positional encoding (RoPE) is the modern standard. In robot learning, positional encodings also encode temporal position within action sequences.
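
A sketch of the sinusoidal variant (interleaved sin/cos across the embedding dimensions, as in the original transformer formulation):

```python
import math

def sinusoidal_encoding(pos, d_model):
    """Sinusoidal positional encoding: even dims use sin, odd dims use cos."""
    return [
        math.sin(pos / 10000 ** (i / d_model)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / d_model))
        for i in range(d_model)
    ]

pe = sinusoidal_encoding(5, 8)   # encoding for the 6th token of an 8-dim model
```

Each dimension oscillates at a different wavelength, so nearby positions get similar vectors while any fixed offset corresponds to a linear transformation of the encoding.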

MLTransformer

Pi0

π₀ (pi-zero) — a VLA model by Physical Intelligence that achieves state-of-the-art performance on diverse manipulation tasks. π₀ uses flow matching for action generation and is pre-trained on internet-scale data plus robot demonstrations from multiple embodiments. It demonstrates strong zero-shot and few-shot generalization to new manipulation tasks.

VLARobot Learning

Painting Robot

An industrial robot arm equipped with spray guns for automated painting and coating application. Painting robots provide consistent coat thickness, reduce paint waste (via electrostatic application), and remove human workers from toxic fume exposure. They are standard in automotive production lines and require careful path planning to ensure complete, even coverage.

ApplicationsIndustrial

Palletizing Robot

A robot that stacks boxes, bags, or cases onto pallets according to a specified pattern. Palletizing is one of the highest-volume industrial robot applications. Dedicated palletizing robots (4-axis, high payload) handle repetitive stacking; collaborative robots handle lighter, mixed-SKU palletizing. Optimal pallet patterns maximize stability and packing density.

ApplicationsIndustrialManipulation

Passivity-Based Control

A control design framework based on the passivity properties of physical systems (energy-dissipating systems). PBC designs controllers that shape the system's total energy to achieve the control objective. Passivity guarantees robust stability in the presence of disturbances and unmodeled dynamics. It is widely used for robot impedance control and physical human-robot interaction.

Control

Port-Hamiltonian Control

A geometric control framework that models systems as networks of energy-storing, energy-dissipating, and energy-routing components with port variables (effort and flow). Port-Hamiltonian models compose naturally and preserve passivity. Controllers designed in this framework are structurally robust and physically interpretable. Used in compliant robot and flexible joint control.

ControlMath

Predictive Functional Control

A simplified MPC formulation using a coincidence horizon approach and pre-specified basis functions for the control trajectory. PFC provides MPC-like constraint handling and predictive behavior with much lower computational cost, enabling real-time implementation on slower hardware. It has been applied to industrial robot trajectory following and process control.

Control

Photometric Calibration

Calibrating the response function of a camera so that measured pixel intensities are linearly related to scene radiance, enabling accurate photometric measurements. Photometric calibration improves the reliability of direct visual odometry methods (DSO, LSD-SLAM), which track using raw pixel intensities and therefore assume a known, typically linear, camera response.

VisionCalibration

Place Recognition Network

A neural network trained to recognize previously visited locations from camera images, regardless of viewpoint or lighting changes. NetVLAD, which aggregates local features into a global descriptor, is the dominant learned architecture; classical bag-of-words methods such as DBoW2 (built on ORB features) remain widely used alternatives. Place recognition networks power loop closure in visual SLAM and are the backbone of robot relocalization.

VisionSLAM

Point Cloud Registration

Aligning two or more point clouds into a common coordinate frame. ICP (Iterative Closest Point) iteratively minimizes point-to-point or point-to-plane distances. NDT (Normal Distributions Transform) represents the reference cloud as a grid of Gaussian distributions. FGR (Fast Global Registration) provides outlier-robust coarse alignment. Registration underpins SLAM and 3D reconstruction.
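
The rigid-alignment step at the core of each ICP iteration (once correspondences are fixed) is a closed-form SVD solve; a sketch on synthetic, noise-free points with known correspondences:

```python
import numpy as np

def best_fit_transform(A, B):
    """Least-squares rigid transform (R, t) aligning points A onto B (Kabsch).
    This is the inner step of each ICP iteration after matching points."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)            # cross-covariance of centered clouds
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:             # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cb - R @ ca
    return R, t

# Synthetic test: rotate and translate a random cloud, then recover the transform
rng = np.random.default_rng(0)
A = rng.random((50, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 0.1])
B = A @ R_true.T + t_true
R, t = best_fit_transform(A, B)
```

Full ICP alternates this solve with a nearest-neighbor correspondence search, re-matching points after each transform update.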

Vision3DSLAM

Projective Transformation

A linear transformation of homogeneous coordinates that maps points between projective spaces. In robotics vision, the camera projection matrix P = K[R|t] is a projective transformation mapping 3D world points to 2D image pixels. Projective transformations preserve straight lines (but not distances or angles) and underlie all perspective camera models.
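
Applying P = K[R|t] is a matrix multiply followed by a perspective divide; the intrinsics and point below are hypothetical:

```python
import numpy as np

# Pinhole intrinsics: 500 px focal length, principal point at (320, 240)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                        # camera aligned with the world frame
t = np.zeros((3, 1))
P = K @ np.hstack([R, t])            # 3x4 projection matrix P = K[R|t]

X = np.array([0.2, -0.1, 2.0, 1.0])  # homogeneous 3-D point, 2 m in front
x = P @ X
u, v = x[0] / x[2], x[1] / x[2]      # perspective divide -> (370.0, 215.0)
```

The divide by the third homogeneous coordinate is what makes the map projective rather than affine: distances and angles are not preserved, but straight lines are.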

VisionMath

Passive Dynamics

The natural mechanical motion of a robot or mechanism when no actuator torques are applied (free to move under gravity and inertia). Passive dynamic walkers are mechanisms that walk down a slope without actuation, powered only by gravity. Understanding and exploiting passive dynamics leads to energy-efficient locomotion strategies.

LocomotionDynamics

Push Recovery

The ability of a legged robot to maintain balance after receiving an impulsive external force (a push). Push recovery strategies include: ankle strategy (torque at ankle, for small pushes), hip strategy (torso tilting), and stepping strategy (take corrective steps). The push recovery capability of a robot is a standard benchmark for balance controller robustness.

LocomotionControlSafety

Packing

Arranging objects optimally in a container (box, bin, bag) to maximize utilization or satisfy specific packing requirements. Robotic packing involves: detecting items, estimating their poses and dimensions, computing packing layouts, and placing items in precise positions. It is one of the most complex commercial robotics applications, combining planning, vision, and dexterous manipulation.

ManipulationIndustrial

Pinch Grasp

A grasp using the fingertips (as opposed to the full finger surface or palm) to grip small or thin objects. Pinch grasps apply high pressure at small contact areas, suitable for delicate or small items. They require high position accuracy and sensitive force control to avoid slippage or crushing. Fine pinch is a key capability of dexterous hands.

ManipulationGrasping

Placement Precision

The accuracy with which a robot can place an object at a specified location and orientation. Placement precision depends on grasp quality, robot repeatability, and the stability of the placed object. Sub-millimeter precision is required for electronics assembly; centimeter-level is sufficient for bulk material handling.

Manipulation

Post-Grasp Manipulation

Manipulating the grasped object after the initial grasp — reorienting it, repositioning it within the hand, or using it as a tool. Post-grasp manipulation extends the useful configuration space beyond what the initial grasp achieves. It includes in-hand manipulation, regrasping, and object pivoting strategies.

ManipulationDexterous

Primitive Actions

Atomic, low-level manipulation behaviors that serve as building blocks for complex task sequences: reach, grasp, lift, move, place, release, push, flip. Primitive actions have well-defined pre- and post-conditions enabling symbolic task planning. Task and motion planning combines symbolic primitive sequencing with geometric feasibility verification.

ManipulationPlanning

Planetary Gear

A gear system consisting of a central sun gear, surrounding planet gears, and an outer ring gear. Planetary gears achieve high reduction ratios in a compact, coaxial package with high torque density and load sharing across multiple planets. They are used in robot joints, wheel drives, and actuators where compactness and high torque are required.

HardwareActuation

Power Density

The ratio of a motor's maximum output power to its mass or volume. High power density motors enable compact, lightweight robot arms with high payload capacity. Frameless (kit) motors with high-energy-density magnets (NdFeB) achieve the highest power densities. Power density is the key actuator metric for dexterous hands and wearable exoskeletons.

HardwareActuation

Particle Filter

A sequential Monte Carlo method that approximates a probability distribution with a set of weighted samples (particles). Each particle is a possible system state; particles are propagated through the dynamics model and reweighted based on measurement likelihood. Particle filters are used for robot localization (AMCL), tracking, and non-Gaussian state estimation.
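
The predict-reweight-resample cycle can be sketched for a 1-D state; the noise level, motion, and measurement model are invented for illustration:

```python
import math
import random

random.seed(0)

def pf_step(particles, weights, control, measurement, noise=0.2):
    """One predict-reweight-resample cycle for a 1-D state."""
    # Predict: push each particle through the motion model, adding process noise
    particles = [p + control + random.gauss(0.0, noise) for p in particles]
    # Reweight: Gaussian likelihood of the measurement given each particle
    weights = [w * math.exp(-0.5 * ((p - measurement) / noise) ** 2)
               for p, w in zip(particles, weights)]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample: draw a new particle set in proportion to the weights
    particles = random.choices(particles, weights=weights, k=len(particles))
    return particles, [1.0 / len(particles)] * len(particles)

n = 500
particles = [random.uniform(-5.0, 5.0) for _ in range(n)]
weights = [1.0 / n] * n
true_x = 0.0
for _ in range(20):
    true_x += 0.1                                # robot moves 0.1 per step
    particles, weights = pf_step(particles, weights, 0.1, true_x)
estimate = sum(particles) / n                    # posterior mean, close to true_x
```

Because the posterior is represented by samples rather than a Gaussian, the same machinery handles multi-modal beliefs, which is why AMCL uses it for global localization.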

MathNavigationSensors

Potential Field

A motion planning method where attractive forces guide the robot toward the goal and repulsive forces push it away from obstacles, combined into a vector field that drives robot motion. Gradient descent on the potential field generates motion commands. Simple to implement in real time, but susceptible to local minima and oscillation near narrow passages.
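
A 2-D point-robot sketch of gradient descent on such a field (gains, influence distance, and the gradient clamp are illustrative choices, not canonical values):

```python
import math

def field_gradient(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.0):
    """Combined attractive + repulsive gradient for a 2-D point robot."""
    gx = k_att * (goal[0] - pos[0])              # attractive: linear pull to goal
    gy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:                     # repulsive: active within d0 only
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 1e-6 < d < d0:
            mag = k_rep * (1.0 / d - 1.0 / d0) / d**3
            gx += mag * dx
            gy += mag * dy
    norm = math.hypot(gx, gy)
    if norm > 2.0:                               # clamp to avoid huge steps near obstacles
        gx, gy = 2.0 * gx / norm, 2.0 * gy / norm
    return gx, gy

# Descend the field from start toward goal, skirting one obstacle
pos, goal, obstacles = [0.0, 0.2], (4.0, 0.0), [(2.0, 0.0)]
for _ in range(1000):
    gx, gy = field_gradient(pos, goal, obstacles)
    pos[0] += 0.02 * gx
    pos[1] += 0.02 * gy
```

If the start were exactly in line with the obstacle and goal, the attractive and repulsive terms could cancel at a local minimum; the slight lateral offset here is what lets the descent slide around.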

MathPlanningNavigation

Pseudoinverse

The Moore-Penrose pseudoinverse of a matrix A, denoted A⁺, providing the least-norm solution to the linear system Ax = b when A is not square or not full rank. In robotics, the Jacobian pseudoinverse computes joint velocities from desired Cartesian velocities: q̇ = J⁺(q)ẋ. The damped least-squares pseudoinverse adds a damping term near singularities.
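
In code, using a hypothetical 2×2 Jacobian (the entries and damping factor are illustrative):

```python
import numpy as np

J = np.array([[-0.7, -0.4],      # hypothetical Jacobian at some arm configuration
              [ 0.9,  0.3]])
x_dot = np.array([0.1, 0.0])     # desired Cartesian velocity

# Moore-Penrose: q_dot = J+ x_dot, the least-norm solution of J q_dot = x_dot
q_dot = np.linalg.pinv(J) @ x_dot

# Damped least squares: J^T (J J^T + lambda^2 I)^-1 keeps q_dot bounded near singularities
lam = 0.01
q_dot_dls = J.T @ np.linalg.inv(J @ J.T + lam**2 * np.eye(2)) @ x_dot
```

Away from singularities the two solutions nearly coincide; as J loses rank the plain pseudoinverse blows up while the damped version trades a small tracking error for bounded joint velocities.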

MathKinematicsControl

Pharmaceutical Robot

A robot operating in pharmaceutical manufacturing or dispensing: filling vials, capping, labeling, compounding, dispensing prescriptions, and quality inspection. Pharma robots must meet GMP (Good Manufacturing Practice) standards, often operate in cleanroom environments, and handle potent compounds safely. Accuracy and traceability (track each vial) are critical requirements.

ApplicationsIndustrial

Precision Agriculture

A farming approach using GPS, sensors, drones, and robots to apply inputs (water, fertilizer, pesticides) variably across a field based on measured crop and soil conditions. Precision agriculture robots include: autonomous tractors (GPS-guided field cultivation), seeding robots (variable-rate planting), and crop scouting drones (detecting pests and disease from aerial imagery).

ApplicationsMobile Robotics

Prosthetic Limb

An artificial limb replacing an amputated body part, increasingly integrating robotics for powered actuation. Myoelectric prosthetics control finger and wrist motions via EMG signals from residual limb muscles. Researchers are developing tactile prosthetics that provide sensory feedback to the wearer's nervous system, restoring both motor and sensory function.

ApplicationsHRIHardware

Partial Observability

A setting where the robot's policy cannot observe the full system state — some state information is hidden. Real robot tasks are partially observable: object poses behind other objects, internal forces, and environment properties not directly sensed. POMDPs (Partially Observable MDPs) model this; recurrent policies and history-conditioned policies handle partial observability.

Robot LearningRL

Policy Distillation

Compressing a large or complex policy (teacher) into a smaller, faster policy (student) by training the student to match the teacher's action distribution via KL divergence minimization. Policy distillation is used to: compress RL policies for real-time deployment, transfer from simulation to real hardware, and combine multiple specialized policies into one.

Robot LearningML

Q

Q-function (Action-Value Function)

The Q-function Q(s, a) estimates the expected cumulative discounted reward an agent will receive by taking action a in state s and then following a given policy thereafter. Q-functions are central to reinforcement learning algorithms such as DQN (discrete actions) and SAC, TD3, and DDPG (continuous actions). In robot RL, learning accurate Q-functions for long-horizon manipulation tasks is challenging because rewards are sparse and the state-action space is high-dimensional. Recent work in offline RL (IQL, CQL) uses Q-functions to extract policies from fixed datasets without online interaction, bridging the gap between imitation learning and RL.
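A toy tabular sketch of the Bellman backup that underlies Q-learning (the 1-D chain environment and all hyperparameters are invented for illustration; deep RL methods like DQN replace the table with a neural network):

```python
import numpy as np

# Toy chain: states 0..4, actions {0: left, 1: right},
# sparse reward of 1.0 only on reaching terminal state 4.
n_states, n_actions, gamma, alpha = 5, 2, 0.9, 0.5
Q = np.zeros((n_states, n_actions))

def step(s, a):
    s_next = min(s + 1, 4) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == 4 else 0.0
    return s_next, reward, s_next == 4

rng = np.random.default_rng(0)
for _ in range(500):                      # episodes
    s = 0
    for _ in range(20):
        a = int(rng.integers(n_actions))  # random exploration
        s_next, r, done = step(s, a)
        # Bellman backup: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = r + (0.0 if done else gamma * Q[s_next].max())
        Q[s, a] += alpha * (target - Q[s, a])
        s = s_next
        if done:
            break
# After training, the greedy policy argmax_a Q(s, a) moves right everywhere.
```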

Reinforcement LearningValue Function

Quasi-static Manipulation

Quasi-static manipulation assumes that motion is slow enough that inertial and dynamic forces are negligible — the system is effectively in static equilibrium at each instant. This simplification enables tractable contact mechanics modeling for planning pushing, sliding, pivoting, and in-hand regrasping actions. Many robot manipulation benchmarks (including most tabletop pick-and-place tasks) operate in the quasi-static regime. When tasks involve fast throws, dynamic catches, or high-speed assembly, quasi-static assumptions break down and full rigid-body dynamics with contact simulation (e.g., MuJoCo, Isaac Sim) are required.

ManipulationMechanics

Quadruped

A four-legged walking robot, often inspired by dog or cat morphology. Quadrupeds balance stability (can maintain static balance) with agility (dynamic gaits like trotting and bounding). The Unitree Go2, Boston Dynamics Spot, ANYmal, and MIT Mini Cheetah are prominent examples. Quadrupeds are used for inspection, surveillance, rough terrain traversal, and as research platforms for locomotion.

LocomotionHardware

Quasi-Direct Drive

A motor-gear combination using a low reduction ratio (typically 4:1–10:1) to preserve most of the backdrivability of a direct-drive motor while boosting torque output. Popularized by the MIT Cheetah and subsequent legged robots, quasi-direct drives enable proprioceptive force sensing through motor current measurement, eliminating the need for dedicated force-torque sensors.

HardwareActuation

Quaternion

A four-component mathematical representation of 3D rotation: q = (w, x, y, z) where w is the scalar part and (x, y, z) is the vector part. Unit quaternions represent rotations without gimbal lock (unlike Euler angles) and interpolate smoothly via SLERP. They are the standard rotation representation in robot control, ROS transforms, and physics simulators.
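A minimal numpy sketch of the Hamilton product and SLERP in the (w, x, y, z) convention described above (function names are illustrative):

```python
import numpy as np

def quat_multiply(q1, q2):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions."""
    dot = np.dot(q0, q1)
    if dot < 0.0:                 # take the shorter arc
        q1, dot = -q1, -dot
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    if theta < 1e-8:              # nearly identical: no interpolation needed
        return q0
    return (np.sin((1 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

q_id = np.array([1.0, 0.0, 0.0, 0.0])                       # identity
q_z90 = np.array([np.cos(np.pi/4), 0.0, 0.0, np.sin(np.pi/4)])  # 90 deg about z
q_half = slerp(q_id, q_z90, 0.5)   # 45-degree rotation about z
```

Composing `q_z90` with itself via `quat_multiply` yields the 180-degree rotation (0, 0, 0, 1), and `q_half` stays on the unit sphere, which is what makes SLERP suitable for trajectory interpolation.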

KinematicsMath

Quadratic Program

An optimization problem with a quadratic objective function and linear (or quadratic) constraints. QPs are solved in milliseconds with active-set or interior-point solvers. In robotics: whole-body control formulates priorities as a hierarchy of QPs; CLF-CBF safety filters solve a QP to minimally modify a nominal controller.
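A hedged sketch of the simplest case — an equality-constrained QP solved via its KKT linear system in numpy (inequality-constrained QPs, as used in whole-body control, need an active-set or interior-point solver; the example problem is invented for illustration):

```python
import numpy as np

def solve_eq_qp(Q, c, A, b):
    """Solve min 0.5 x^T Q x + c^T x  s.t.  A x = b via the KKT system.

    Stationarity (Q x + c + A^T lam = 0) and feasibility (A x = b)
    stack into one linear solve for x and the multipliers lam.
    """
    n, m = Q.shape[0], A.shape[0]
    K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-c, b])
    sol = np.linalg.solve(K, rhs)
    return sol[:n]  # primal solution; sol[n:] are the Lagrange multipliers

# Closest point to the origin on the line x0 + x1 = 1
x = solve_eq_qp(Q=np.eye(2), c=np.zeros(2),
                A=np.array([[1.0, 1.0]]), b=np.array([1.0]))
# x == [0.5, 0.5]
```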

MathControlPlanning

Quality Control Robot

A robot that performs automated quality inspection on manufactured parts, detecting defects (scratches, dimensional deviations, assembly errors) using vision, force, or measurement sensors. QC robots replace tedious, error-prone human inspection with consistent, quantitative measurements. Machine vision systems with defect detection models (CNN-based) identify surface defects at production speed.

ApplicationsIndustrialVision

R

Real-to-Sim Transfer

Real-to-sim transfer (the complement of sim-to-real) involves constructing or calibrating a simulation to match the real world as closely as possible — essentially building a digital twin of real conditions. This is used to replay real failure cases in simulation, generate additional synthetic training data matched to real sensor characteristics, and test policy updates safely before deployment. Techniques include photogrammetric scene reconstruction, physics parameter identification (system identification), and neural rendering methods (NeRF, 3D Gaussian Splatting) to match camera appearance. Accurate real-to-sim pipelines dramatically reduce the number of physical experiments needed for policy iteration.

SimulationDigital TwinData

Reach

Reach is the maximum distance from a robot arm's base to any point its end-effector can access within its workspace. For a serial arm, maximum reach equals the sum of all link lengths. Effective reach in a deployment is smaller — accounting for joint limits, self-collision avoidance, and the need to approach objects from multiple orientations. Reach determines which workstation layouts and object placements are feasible. When selecting robots for a task, engineers must confirm that the required workspace (including all approach directions for grasping) falls within the robot's reachable envelope at acceptable accuracy. Within a given robot class, larger reach generally trades off against payload capacity.

HardwareSpecsKinematics

Replay Buffer

A replay buffer (or experience replay memory) is a dataset of past (state, action, reward, next state, done) transitions collected by an RL agent during environment interaction. At each training step, random mini-batches are sampled from the buffer to train the value function or policy, breaking temporal correlations that would destabilize gradient updates. In offline RL and robot learning, the replay buffer is replaced by a fixed dataset of human demonstrations or previously collected rollouts. Prioritized experience replay weights sampling by temporal-difference error to focus training on informative transitions.
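A minimal Python sketch of a uniform-sampling replay buffer (class and parameter names are illustrative; production implementations store transitions as pre-allocated arrays for speed):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity buffer of (state, action, reward, next_state, done) tuples.

    Uniform random sampling breaks the temporal correlation of consecutive
    transitions; prioritized variants instead sample by TD error.
    """
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are evicted

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=1000)
for t in range(50):                        # fake rollout of 50 transitions
    buf.add(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
states, actions, rewards, next_states, dones = buf.sample(batch_size=8)
```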

Reinforcement LearningData

Reward Function

The reward function defines the learning objective for a reinforcement learning agent: it assigns a scalar reward signal r(s, a, s') to each (state, action, next-state) transition, telling the agent how good or bad its actions are. Reward function design is one of the hardest parts of applying RL to robotics: sparse rewards (1 on success, 0 otherwise) are clean but lead to slow learning; dense rewards (e.g., negative distance to goal) guide learning but can be gamed in unexpected ways (reward hacking). Alternatives include reward learning from demonstrations (IRL, RLHF), task-specific simulation metrics, and learned preference models. Imitation learning sidesteps the reward design problem entirely by learning directly from demonstrations.
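The sparse-versus-dense distinction can be made concrete with two toy reward functions for a reaching task (the threshold and positions are invented for illustration):

```python
import numpy as np

def sparse_reward(ee_pos, goal_pos, threshold=0.02):
    """1.0 only when the end-effector is within `threshold` meters of the goal."""
    return 1.0 if np.linalg.norm(ee_pos - goal_pos) < threshold else 0.0

def dense_reward(ee_pos, goal_pos):
    """Negative distance to goal: informative at every timestep, but a policy
    can exploit it (e.g. hovering near the goal) if success is not rewarded."""
    return -np.linalg.norm(ee_pos - goal_pos)

ee = np.array([0.30, 0.10, 0.25])
goal = np.array([0.30, 0.10, 0.24])
r_sparse = sparse_reward(ee, goal)   # 1.0 (within 2 cm)
r_dense = dense_reward(ee, goal)     # ~ -0.01
```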

Reinforcement LearningCore Concept

Reach

The maximum distance from a robot's base to its end-effector, typically measured as the radius of the reachable workspace sphere. Reach determines the working volume and is a primary factor in robot selection and workcell layout. For articulated arms, reach is roughly equal to the sum of link lengths. Larger reach generally means lower payload capacity for the same robot class.

HardwareKinematics

Real-Time Control

Control systems that must meet strict timing deadlines for each control loop iteration. In robotics, servo-rate control loops run at 1 kHz (1ms period); trajectory planners at 100–500 Hz. Missing a deadline can cause instability or unsafe behavior. Real-time operating systems (RT-PREEMPT Linux, VxWorks) and dedicated control hardware (EtherCAT) ensure deterministic timing.

ControlSoftware

Redundancy Resolution

The process of selecting a specific joint configuration from the infinite set of solutions available to a redundant robot (one with more DOF than the task requires). Common strategies include: minimizing a secondary cost (joint limits, manipulability), gradient projection in the null space, and weighted pseudoinverse methods. Redundancy resolution is critical for humanoids and mobile manipulators.

KinematicsControl

Regrasping

Changing the grasp configuration on an object by placing it down and re-grasping in a different pose, or by using environmental fixtures (edges, surfaces) to rotate the object within the hand. Regrasping extends the effective dexterity of simple grippers and is necessary when the initial grasp is incompatible with the target placement orientation.

ManipulationGrasping

Repeatability

The ability of a robot to return to the same position repeatedly under identical conditions. Unidirectional repeatability measures consistency from one direction; bidirectional includes approach from different directions. ISO 9283 defines the test method. High repeatability (±0.01–0.05mm) is essential for precision assembly, welding, and measurement tasks.

HardwareIndustrial

Representation Learning

Learning useful feature representations from raw data (images, point clouds, proprioception) that capture task-relevant structure. Good representations compress high-dimensional inputs into informative, compact vectors that make downstream policy learning easier. Self-supervised methods (contrastive learning, masked autoencoders), pre-trained vision models, and world models all serve as representation learning approaches for robotics.

Robot Learning

Resolved Rate Control

A velocity-based control method that uses the Jacobian to map desired Cartesian end-effector velocities to joint velocities: q̇ = J⁻¹(q)ẋ. It is the basis of most teleoperation interfaces where a human specifies Cartesian velocity commands. Near singularities, the Jacobian becomes ill-conditioned; damped least-squares (DLS) or Jacobian-transpose methods provide numerical robustness.

ControlTeleoperation

Reward Shaping

Modifying the reward function to provide denser, more informative learning signals without changing the optimal policy. For example, adding a potential-based shaping reward that gives partial credit for getting closer to the goal. Reward shaping dramatically accelerates RL training in sparse-reward environments but requires care to avoid introducing unintended local optima.
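A small numpy sketch of potential-based shaping, F(s, s′) = γΦ(s′) − Φ(s), which is the form proven (Ng et al., 1999) to leave the optimal policy unchanged (the potential and states here are invented for illustration):

```python
import numpy as np

GAMMA = 0.99

def potential(state, goal):
    """Potential Phi(s): higher (less negative) the closer s is to the goal."""
    return -np.linalg.norm(state - goal)

def shaped_reward(r, state, next_state, goal):
    """r + gamma * Phi(s') - Phi(s): potential-based shaping term added to
    the environment reward r."""
    return r + GAMMA * potential(next_state, goal) - potential(state, goal)

goal = np.array([1.0, 0.0])
s, s_next = np.array([0.0, 0.0]), np.array([0.5, 0.0])
# Moving toward the goal earns a positive bonus even though sparse r = 0.
bonus = shaped_reward(0.0, s, s_next, goal)
```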

Robot LearningRL

RGB-D Camera

A camera system that simultaneously captures color (RGB) and depth (D) images, producing aligned color-depth frames. The Intel RealSense D400 series and Azure Kinect are popular in robotics. RGB-D data is the standard input for 6-DOF grasp pose estimation, tabletop segmentation, and point-cloud-based manipulation policies.

SensorsVisionHardware

Risk Assessment

The systematic process of identifying hazards, estimating risk (severity × probability × exposure), and determining appropriate risk reduction measures for a robot system. Risk assessment per ISO 12100 is mandatory for CE marking and is the first step in robot safety engineering. It covers mechanical, electrical, thermal, radiation, and software-related hazards.

SafetyStandards

RLDS

Reinforcement Learning Datasets — a standardized format developed by Google for storing RL and robot learning datasets as TFRecord files with a common schema. RLDS provides dataset builders, loaders, and transformations compatible with TensorFlow Datasets. The Open X-Embodiment dataset and many Google robot learning datasets use RLDS format.

DataSoftware

RoboCasa

A large-scale simulation benchmark for household robot manipulation, providing thousands of procedurally generated kitchen and home environments with realistic object interactions. RoboCasa is designed to evaluate generalist manipulation policies on diverse everyday tasks and is built on the RoboSuite simulation framework.

SimulationBenchmark

RoboSuite

An open-source simulation framework for robot manipulation research, built on MuJoCo. RoboSuite provides standardized robot models (Panda, UR5, IIWA), task environments (pick-and-place, nut assembly, door opening), and a modular API for benchmarking manipulation algorithms. It is widely used in the offline RL and imitation learning communities.

SimulationSoftwareBenchmark

ROS

Robot Operating System — an open-source middleware framework providing tools, libraries, and conventions for building robot software. ROS provides a publish-subscribe messaging system (topics), services, actions, parameter server, and a vast ecosystem of community packages. ROS 1 (Melodic, Noetic) is being superseded by ROS 2 (Humble, Iron, Jazzy) with improved real-time, security, and multi-robot support.

Software

ROS 2

The next generation of the Robot Operating System, built on DDS (Data Distribution Service) middleware for improved real-time performance, security, and multi-robot communication. ROS 2 supports lifecycle management, QoS (Quality of Service) policies, and native multi-platform deployment. Key distributions include Humble (LTS), Iron, and Jazzy.

Software

Rotation Matrix

A 3×3 orthogonal matrix with determinant +1 that represents the orientation of one coordinate frame relative to another. Rotation matrices compose by multiplication (R_total = R2 · R1), are always invertible (R⁻¹ = Rᵀ), and preserve vector lengths. They form the Special Orthogonal group SO(3). In robotics, rotation matrices are the core building block of homogeneous transformations.
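The SO(3) properties listed above can be checked numerically in a few lines of numpy (the z-axis rotation helper is illustrative):

```python
import numpy as np

def rot_z(theta):
    """Rotation of `theta` radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

R1 = rot_z(np.pi / 6)        # 30 degrees
R2 = rot_z(np.pi / 3)        # 60 degrees
R_total = R2 @ R1            # composition by multiplication: 90 degrees

# SO(3): orthogonal (R^-1 = R^T), det = +1, vector lengths preserved
v = np.array([1.0, 2.0, 3.0])
```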

KinematicsMath

RRT

Rapidly-exploring Random Tree — a sampling-based motion planning algorithm that incrementally builds a tree by sampling random configurations and extending the nearest tree node toward them. RRT is single-query (one start-goal pair) and excels in high-dimensional spaces. RRT* is an optimal variant that rewires the tree to find shorter paths. Both are widely used in robot arm motion planning.
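A minimal 2-D sketch of the core RRT loop in an obstacle-free square world (the step size, goal bias, and world bounds are invented for illustration; a real planner adds collision checking at each extension):

```python
import numpy as np

def rrt(start, goal, step=0.5, max_iters=2000, goal_tol=0.5, seed=0):
    """Toy RRT in [0, 10]^2: sample, find nearest node, extend by `step`."""
    rng = np.random.default_rng(seed)
    goal = np.asarray(goal, dtype=float)
    nodes = [np.asarray(start, dtype=float)]
    parents = [-1]
    for _ in range(max_iters):
        # 10% goal bias speeds convergence toward the goal region
        sample = goal if rng.random() < 0.1 else rng.uniform(0, 10, size=2)
        nearest = int(np.argmin([np.linalg.norm(n - sample) for n in nodes]))
        direction = sample - nodes[nearest]
        norm = np.linalg.norm(direction)
        if norm < 1e-9:
            continue
        new = nodes[nearest] + step * direction / norm
        nodes.append(new)
        parents.append(nearest)
        if np.linalg.norm(new - goal) < goal_tol:
            path, i = [], len(nodes) - 1     # walk back up the tree
            while i != -1:
                path.append(nodes[i])
                i = parents[i]
            return path[::-1]
    return None  # no path within the iteration budget

path = rrt(start=(1.0, 1.0), goal=(9.0, 9.0))
```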

NavigationPlanning

RT-2

Robotics Transformer 2 — a VLA model from Google DeepMind that fine-tunes a large vision-language model (PaLI-X or PaLM-E) to output robot actions as text tokens. RT-2 demonstrates that internet-scale pre-training enables robots to follow novel language instructions and generalize to unseen objects and scenarios. It represents the paradigm of treating robot action prediction as a vision-language modeling problem.

Robot LearningVLA

RViz

The 3D visualization tool for ROS, used to display robot models, sensor data (point clouds, laser scans, camera images), TF frames, paths, markers, and interactive markers. RViz is the primary debugging and monitoring tool for ROS-based robot systems. RViz2 is the ROS 2 version.

Software

ResNet

Residual Network — a CNN architecture that uses skip connections (residual connections) to enable training of very deep networks (50–152+ layers) by alleviating the vanishing gradient problem. ResNet-18 and ResNet-50 are standard visual encoders in robot learning, providing a good balance of representational power and computational cost for real-time perception.

MLVisionArchitecture

Reward Model

A neural network trained to predict scalar reward values from state-action pairs or trajectory segments, typically trained on human preferences or expert demonstrations. Reward models replace hand-engineered reward functions with learned ones. They are central to RLHF-style training and are used in robotics to specify complex task objectives that are hard to formalize mathematically.

MLRL

RNN

Recurrent Neural Network — a neural network that processes sequential data by maintaining a hidden state that is updated at each timestep. RNNs can, in principle, model arbitrary-length sequences, but vanilla RNNs suffer from vanishing gradients. LSTM and GRU variants address this. In robotics, RNNs provide memory for partially observable tasks but are being replaced by transformers.

MLArchitecture

RLBench

A large-scale reinforcement learning benchmark for robot manipulation built on CoppeliaSim (V-REP). RLBench provides 100 diverse tasks with demonstrations, sparse and shaped rewards, and multiple observation types. Each task has language descriptions, enabling evaluation of language-conditioned and multi-task manipulation policies.

BenchmarkRobot LearningSimulation

RT-1

Robotics Transformer 1 — a transformer-based manipulation policy by Google that processes images and language instructions to output discretized actions. RT-1 was trained on 130K episodes from 13 robots over 17 months. It demonstrated that large-scale real-world data collection enables highly generalizable manipulation policies with 97% success rate on trained tasks.

VLARobot Learning

Rehabilitation Robot

A robot that assists with physical rehabilitation and therapy, providing repetitive, precise movements for patients recovering from stroke, spinal cord injury, or orthopedic surgery. Rehabilitation robots include exoskeletons (HAL, Ekso), end-effector devices (InMotion), and cable-driven systems. They provide consistent therapy intensity and quantitative progress tracking.

ApplicationsHRI

Robust Control

Control design that maintains stability and performance despite uncertainty in the system model or disturbances. Robust controllers (H∞, H2, mu-synthesis) are designed with worst-case guarantees over a specified uncertainty set. In robotics, robust control handles uncertain link masses, variable payloads, and unmodeled joint flexibility without requiring exact system identification.

Control

RANSAC

Random Sample Consensus — a robust estimation algorithm that fits a model to data containing outliers by: (1) randomly sampling a minimal set of data points, (2) fitting the model, (3) counting inliers (points within a threshold), and repeating many times. The model with the most inliers is chosen. RANSAC is standard for homography estimation, fundamental matrix computation, and plane fitting.
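The sample-fit-count loop can be sketched for the simplest case, 2-D line fitting (the data, thresholds, and function name are invented for illustration; vision libraries apply the same loop to homographies and fundamental matrices):

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_thresh=0.1, seed=0):
    """Fit y = m*x + b to 2-D points containing outliers via RANSAC."""
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)  # minimal sample
        (x1, y1), (x2, y2) = points[i], points[j]
        if abs(x2 - x1) < 1e-9:
            continue                       # vertical pair: skip this sample
        m = (y2 - y1) / (x2 - x1)          # fit the minimal model
        b = y1 - m * x1
        residuals = np.abs(points[:, 1] - (m * points[:, 0] + b))
        n_inliers = int(np.sum(residuals < inlier_thresh))
        if n_inliers > best_inliers:       # keep the model with most inliers
            best_model, best_inliers = (m, b), n_inliers
    return best_model, best_inliers

# 80 points near y = 2x + 1 plus 20 gross outliers
rng = np.random.default_rng(42)
xs = rng.uniform(0, 10, 80)
inlier_pts = np.column_stack([xs, 2 * xs + 1 + rng.normal(0, 0.02, 80)])
outlier_pts = rng.uniform(0, 10, size=(20, 2))
(m, b), count = ransac_line(np.vstack([inlier_pts, outlier_pts]))
```

A least-squares fit on the same data would be dragged off the line by the 20 outliers; RANSAC recovers slope ≈ 2 and intercept ≈ 1 because the outliers never dominate any high-consensus model.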

VisionMath

Rectification

Transforming stereo image pairs so that corresponding epipolar lines become horizontal and at the same vertical pixel coordinate. After rectification, stereo matching reduces to a 1D horizontal search. Rectification uses the fundamental matrix computed from calibration. Most stereo vision pipelines rectify images before disparity computation.

VisionCalibration

Reduced-Order Model

A simplified, low-dimensional model that captures the essential dynamics of a complex system while ignoring higher-order effects. For legged robots, common reduced-order models include: LIPM (Linear Inverted Pendulum Model), Spring-Loaded Inverted Pendulum (SLIP), and Centroidal Dynamics. These models enable fast planning and control that is then extended to the full-body system.

LocomotionControlDynamics

Running

Dynamic legged locomotion with a flight phase, where all feet leave the ground simultaneously. Running is characterized by higher speed, lower duty cycle, and higher peak ground reaction forces than walking. Bipedal running is particularly challenging due to the small support base and high impact forces at landing. RL-trained policies can learn robust running on complex terrain.

Locomotion

Resolver

An electromagnetic position sensor whose two output windings produce quadrature signals modulated by the sine and cosine of the shaft angle. Resolvers are inherently robust to contamination, vibration, and extreme temperature — advantages over optical encoders in industrial environments. Resolver-to-digital converters (RDC chips) decode the analog signals into digital position values.
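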

HardwareSensors

Robot Cable

A flexible cable designed for continuous motion (torsion, bending, tensile stress) in robot applications. Robot cables use stranded copper conductors (for flexibility), cross-linked insulation (for heat and abrasion resistance), and braided shields. They are rated for millions of flex cycles, unlike standard cables that would fatigue and fail in robot motion applications.

Hardware

Real-Time Operating System

An operating system that guarantees execution of tasks within specified time deadlines. RTOS kernels (RT-PREEMPT Linux, FreeRTOS, Zephyr, VxWorks) prioritize deterministic scheduling over maximum throughput. In robots, RTOSes run closed-loop servo controllers at 1 kHz with bounded jitter (<10μs), while the non-real-time OS layer handles higher-level planning.

SoftwareControl

Robot Controller

The computing and electronics system that executes the robot's control software, communicates with actuators and sensors, and interfaces with the user and external systems. Industrial robot controllers are proprietary systems (KUKA KRC5, ABB IRC5, Fanuc R-30iB). Research robots use open architectures (PC + EtherCAT + real-time OS).

HardwareSoftwareControl

ROS Bag

A file format for recording and playing back ROS topic data. ROS bags capture timestamped messages from any set of topics (sensor data, control commands, state estimates) for offline analysis, debugging, and dataset creation. In ROS 2, the rosbag2 tool records to pluggable storage backends, with MCAP and SQLite3 the standard options. Large datasets from robot deployments are often stored as ROS bags.

SoftwareData

Riemannian Geometry

The study of curved (non-Euclidean) spaces equipped with a metric tensor. Robot configuration spaces (SO(3), SE(3)) are Riemannian manifolds. Riemannian optimization methods (geodesic gradient descent, parallel transport) operate directly on these manifolds without coordinate singularities. Relevant for rotation averaging, SLAM on manifolds, and learning on Lie groups.

MathKinematics

Robust Estimation

Statistical estimation methods that are insensitive to outliers in the data. In robotics, sensor measurements frequently contain outliers (spurious LiDAR reflections, incorrect feature matches). Robust estimators (RANSAC, M-estimators like Huber or Cauchy, the Graduated Non-Convexity method) downweight or exclude outliers from state estimation.

MathSensorsSLAM

Remote Presence Robot

A wheeled robot with a display and camera that enables a remote person to navigate and interact in a physical space. Telepresence robots (Double Robotics, Beam) are used for remote office presence, hospital patient visits, and remote site inspection. The remote operator sees through the robot's camera and speaks through its speaker, appearing as a moving avatar.

ApplicationsHRITeleoperation

Robot Nurse

A robot assisting with nursing tasks: patient vital monitoring, medication dispensing, patient transport, and specimen transport. Robot nurses reduce the physical burden on healthcare staff and enable round-the-clock monitoring. Key challenges include patient interaction safety, sterile handling requirements, and navigation in dynamic hospital environments.

ApplicationsMedicalMobile Robotics

Reactive Policy

A policy that maps current observations directly to actions without planning or memory, responding immediately to sensory input. Reactive policies are fast and robust to plan execution errors, but cannot reason about future states. Reflexive behaviors (obstacle avoidance, grasp correction) are implemented as reactive policies, while deliberative behaviors use planning.

Robot LearningPolicyControl

Reward Engineering

The process of designing a scalar reward function that encodes the desired robot behavior for RL training. Good reward engineering provides dense feedback that guides learning without introducing unintended shortcuts. Common pitfalls: reward hacking (optimizing the reward proxy rather than the intended behavior) and sparse rewards (learning signal too infrequent).

Robot LearningRL

Risk-Aware RL

RL that considers not just expected return but also the variance or tail risk of outcomes. CVaR (Conditional Value at Risk) optimization maximizes expected return over the worst-case tail of outcomes rather than the overall average. Risk-aware RL is important for physical robots where rare catastrophic outcomes (hardware damage, human injury) must be avoided even at the cost of average performance.

Robot LearningRLSafety

Robustness to Disturbances

The ability of a robot control policy to maintain successful task completion despite external perturbations (pushes, unexpected contacts, wind), sensor noise, and actuator disturbances. Robustness is improved through: domain randomization (training on varied dynamics), adversarial training, and uncertainty-aware policies. It is a prerequisite for real-world deployment.

Robot LearningControlSafety

S

Sim-to-Real Transfer

Sim-to-real transfer is the process of training a robot policy entirely or primarily in simulation and then deploying it on a physical robot, with the goal that the policy works without (or with minimal) additional real-world data. The core challenge is the reality gap — differences in physics fidelity, visual appearance, sensor noise, and unmodeled dynamics between simulation and the real world. Key mitigation techniques include domain randomization (randomizing simulation parameters during training), system identification (calibrating simulation to match real hardware), and adaptive fine-tuning on small amounts of real data. Sim-to-real is the dominant paradigm for RL-based locomotion and is increasingly common for manipulation. See the detailed article.

Transfer LearningSimulationDeployment

State Space

The state space is the complete set of configurations a robot and its environment can be in. In RL, the Markov state s encodes all information needed to predict future rewards and state transitions — ideally a complete description of the world. In practice, the agent only has access to partial observations (images, joint angles) that may not fully capture the state (e.g., occluded objects, unknown physics parameters). Designing an observation space that approximates the Markov state well while remaining computationally tractable is a key challenge in robot learning system design.

Reinforcement LearningControl

Surgical Robotics

Surgical robotics applies robot systems to medical procedures, most famously via Intuitive Surgical's da Vinci platform for minimally invasive laparoscopic surgery. Surgical robots provide motion scaling (translating large operator movements to sub-millimeter instrument motion), tremor filtration, and enhanced visualization inside the patient. Emerging research explores autonomous surgical subtasks (suturing, tissue retraction), AI-assisted guidance, and tele-surgery over low-latency 5G links. Regulatory approval (FDA 510(k) or PMA for the US) adds substantial validation burden. Surgical robotics sits at the intersection of teleoperation, HRI, and contact-rich manipulation.

MedicalTeleoperationApplication

SAC

Soft Actor-Critic — an off-policy RL algorithm that maximizes both expected return and policy entropy, encouraging exploration while maintaining stability. SAC is popular for robot manipulation in simulation due to its sample efficiency and robustness to hyperparameters. The entropy regularization prevents premature convergence to deterministic policies and improves robustness to environment variations.

Robot LearningRL

Safety Constraint RL

Constrained reinforcement learning that optimizes return while satisfying safety constraints (e.g., collision probability < 0.01, joint torque limits). Constrained policy optimization (CPO), Lagrangian methods, and control barrier functions are common approaches. Safe RL is critical for real-world robot deployment where constraint violations can damage hardware or harm people.

Robot LearningRLSafety

Safety-Rated Monitored Stop

A collaborative operation mode (ISO 10218) where the robot stops before a human enters the collaborative workspace and remains stationary while the human is present. The robot holds position using safety-rated monitoring of joint positions and velocities. The robot can resume motion when the human leaves the workspace. This is the simplest collaborative mode to implement.

SafetyStandards

Sample Efficiency

The amount of data (environment interactions, demonstrations, or training samples) required to achieve a given level of performance. Sample-efficient algorithms learn faster from less data. In robotics, sample efficiency is crucial because real-world data collection is expensive and slow. Offline RL, model-based RL, data augmentation, and pre-training all improve sample efficiency.

Robot Learning

SCARA Robot

Selective Compliance Assembly Robot Arm — a robot with two revolute joints providing horizontal motion and one prismatic joint for vertical motion. SCARA robots offer high speed and rigidity for planar assembly tasks (inserting, screwing, soldering) with a compact footprint. They are common in electronics manufacturing and light assembly.

HardwareIndustrial

Screw Theory

A mathematical framework that represents rigid-body motions as rotations about and translations along a screw axis. The Product of Exponentials (POE) formula is an alternative to DH parameters for computing forward kinematics, and is considered more elegant and less error-prone. Modern robotics textbooks (Lynch & Park) and libraries (Modern Robotics) favor the screw-theory approach.
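A minimal numpy sketch of the Product of Exponentials formula for a planar 2R arm with unit link lengths (the arm geometry is invented for illustration; the Modern Robotics library provides production implementations of these routines):

```python
import numpy as np

def exp_twist(omega, v, theta):
    """exp([xi] * theta) for a unit rotation axis omega (Rodrigues' formula)."""
    wx, wy, wz = omega
    skew = np.array([[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]])
    R = np.eye(3) + np.sin(theta) * skew + (1 - np.cos(theta)) * skew @ skew
    G = (np.eye(3) * theta + (1 - np.cos(theta)) * skew
         + (theta - np.sin(theta)) * skew @ skew)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = G @ v
    return T

# Screw axis of a revolute joint through point p with axis omega: v = -omega x p.
M = np.eye(4); M[0, 3] = 2.0                                   # home pose of the EE
S1 = (np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 0.0]))    # joint 1 at origin
S2 = (np.array([0.0, 0.0, 1.0]), np.array([0.0, -1.0, 0.0]))   # joint 2 at (1, 0, 0)

def fk(theta1, theta2):
    """POE forward kinematics: T = exp([S1] t1) exp([S2] t2) M."""
    return exp_twist(*S1, theta1) @ exp_twist(*S2, theta2) @ M

T = fk(np.pi / 2, 0.0)   # whole arm rotated up: end-effector at (0, 2, 0)
```

Note that, unlike DH parameters, the screw axes are all expressed in the fixed base frame at the home configuration, which is what makes the formulation less error-prone to set up.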

KinematicsMath

Self-Play

A training paradigm where agents improve by competing or collaborating with copies of themselves, generating training data through their own interactions. In robotics, self-play is used for multi-agent tasks (competitive object manipulation, adversarial robustness), and for generating diverse behavioral strategies. The agent learns robust policies because its training opponents continuously improve.

Robot LearningRL

Semantic SLAM

SLAM systems that build maps annotated with semantic information (object classes, room types, material properties) in addition to geometric structure. Semantic SLAM combines classical geometric SLAM with object detection and scene understanding. Semantic maps enable task-level reasoning: 'go to the kitchen' rather than 'go to coordinate (3.2, 5.1)'.

NavigationSLAMVision

Serial Robot

A robot arm consisting of a chain of links connected by joints in series, from base to end-effector. Most industrial robot arms (6-axis articulated arms from KUKA, ABB, FANUC, Universal Robots) are serial robots. Serial architectures offer large workspaces and dexterity but have lower rigidity than parallel robots. The most common configuration is 6 revolute joints (6R).

Hardware

Series Elastic Actuator

A compliant actuator that places a known spring element between the motor/gearbox output and the load. By measuring the spring deflection, the actuator can accurately sense and control interaction forces. SEAs are fundamental to many collaborative robot arms and prosthetic limbs. The spring also acts as a mechanical energy buffer, improving impact robustness.

HardwareActuationSafety

Servo Motor

A motor integrated with a position sensor (encoder) and a closed-loop controller that can precisely command position, velocity, or torque. Hobby servos use a potentiometer and PWM control; industrial servos use optical encoders with field-oriented control. Servo motors are the building blocks of nearly every robot arm, from desktop manipulators to industrial 6-axis arms.

HardwareActuation

Singularity

A robot configuration where the Jacobian loses rank, meaning the robot loses one or more degrees of freedom in Cartesian space. At singularities, certain Cartesian velocities become impossible (or require infinite joint velocities). Common singularity types: shoulder singularity, elbow singularity, wrist singularity. Trajectory planners and controllers must detect and avoid or handle singularities.
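Singularity detection is commonly done by checking the smallest singular value of the Jacobian; a numpy sketch for a planar 2R arm (the Jacobian helper is illustrative — for this arm, full elbow extension is the singular configuration):

```python
import numpy as np

def jacobian_2r(theta1, theta2, l1=1.0, l2=1.0):
    """Planar 2R arm Jacobian mapping joint velocities to (x, y) EE velocity."""
    s1, c1 = np.sin(theta1), np.cos(theta1)
    s12, c12 = np.sin(theta1 + theta2), np.cos(theta1 + theta2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def min_singular_value(J):
    """Smallest singular value of J: zero at a singularity, small near one."""
    return np.linalg.svd(J, compute_uv=False)[-1]

# Fully extended elbow (theta2 = 0): the Jacobian columns become parallel
# and the arm cannot generate velocity along the radial direction.
sigma_singular = min_singular_value(jacobian_2r(0.3, 0.0))
sigma_regular = min_singular_value(jacobian_2r(0.3, np.pi / 2))
```

Controllers typically monitor this quantity (or the related manipulability measure) online and switch to damped least-squares when it falls below a threshold.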

KinematicsControl

Singularity Avoidance

Techniques for preventing a robot from entering or approaching singular configurations during motion. Methods include: joint-space path planning that circumvents singular regions, damped least-squares Jacobian pseudoinverse (adds a damping term near singularities), and redundancy-based avoidance (using extra DOF to maintain distance from singularities).

ControlKinematics
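
The damped least-squares update described above can be sketched in a few lines of NumPy. This is a minimal illustration, not production IK code: the damping constant and the toy near-singular Jacobian are illustrative.

```python
import numpy as np

def dls_ik_step(J, dx, damping=0.05):
    """Damped least-squares step: dq = J^T (J J^T + lambda^2 I)^-1 dx.
    Near a singularity the damping term keeps joint velocities bounded."""
    m = J.shape[0]
    return J.T @ np.linalg.solve(J @ J.T + (damping**2) * np.eye(m), dx)

# Toy near-singular Jacobian (e.g., an arm almost fully stretched out)
J = np.array([[-1.0e-3, -1.0e-3],
              [ 2.0,     1.0   ]])
dx = np.array([0.01, 0.0])                 # small Cartesian correction
dq_plain = np.linalg.pinv(J) @ dx          # plain pseudoinverse: explodes
dq_damped = dls_ik_step(J, dx)             # damped solution: stays bounded
```

Near the singularity the plain pseudoinverse demands very large joint velocities, while the damped solution trades a small tracking error for bounded, safe motion.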

SLAM

Simultaneous Localization and Mapping — the problem of building a map of an unknown environment while simultaneously tracking the robot's position within it. SLAM is a chicken-and-egg problem: you need a map to localize, and you need localization to build a map. It is solved through probabilistic estimation (filtering or graph optimization). SLAM is fundamental to autonomous mobile robotics.

NavigationSLAM

Sliding Mode Control

A robust nonlinear control technique that forces the system state onto a predefined surface (sliding surface) in state space, then constrains it to slide along that surface to the desired equilibrium. Sliding mode control is insensitive to matched uncertainties and disturbances, making it suitable for robots with uncertain dynamics. The main challenge is chattering — high-frequency switching near the surface.

Control
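
As a toy illustration of the idea, here is a minimal sliding mode controller for a disturbed double integrator. The gains, timestep, and disturbance are illustrative, chosen only to show finite-time reaching and disturbance rejection.

```python
import numpy as np

# Toy sliding mode control of a double integrator x'' = u + d(t).
# Sliding surface s = v + lam*x; control u = -lam*v - k*sign(s) gives
# s' = d - k*sign(s), so s is driven to zero despite any |d| < k.
lam, k, dt = 2.0, 5.0, 1e-3
x, v = 1.0, 0.0                        # initial tracking error and rate
for step in range(8000):
    s = v + lam * x                    # sliding variable
    u = -lam * v - k * np.sign(s)      # switching control (chatters near s=0)
    d = 0.5 * np.sin(0.01 * step)      # bounded matched disturbance
    v += (u + d) * dt                  # explicit Euler integration
    x += v * dt
# Once on the surface, x decays as x' = -lam*x regardless of d(t)
```

The high-frequency switching visible in `u` near `s = 0` is exactly the chattering the definition mentions; boundary-layer (saturation) versions of `sign` are a common mitigation.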

Snake Robot

A robot with a long, slender body composed of many linked segments that locomotes through undulation, inspired by biological snakes. Snake robots can navigate confined spaces (pipes, rubble, narrow passages) that conventional robots cannot access. Applications include industrial pipe inspection, search-and-rescue, and surgical endoscopy.

HardwareLocomotion

Soft Gripper

A gripper made from compliant materials (silicone, elastomers) that deforms to conform to object shapes. Soft grippers are inherently safe for human interaction and can grasp fragile or irregular objects without damage. Actuation methods include pneumatic inflation, tendon drives, and shape memory alloys. The trade-off is lower precision and payload compared to rigid grippers.

HardwareGraspingSafety

Soft Robot

A robot made primarily from compliant, deformable materials (silicones, hydrogels, textiles) rather than rigid links and joints. Soft robots can safely interact with humans, conform to irregular surfaces, and squeeze through narrow openings. Actuation approaches include pneumatics, cable drives, electroactive polymers, and shape memory alloys. Soft robotics is an active research frontier.

HardwareSafety

Speed and Separation Monitoring

A collaborative operation mode defined in ISO 10218, with requirements detailed in ISO/TS 15066, where the robot adjusts its speed based on the distance to the nearest human. As the human approaches, the robot slows down; if the distance falls below a minimum threshold, the robot stops. This mode requires real-time human tracking (via safety-rated LiDAR, 3D cameras, or light curtains) and certified speed limiting.

SafetyStandards

State Estimation

The process of inferring the robot's internal state (joint positions, velocities, contact modes) and external state (object poses, environment geometry) from noisy sensor measurements. State estimators range from simple filters (complementary, Kalman) to complex probabilistic frameworks (particle filters, factor graphs). Accurate state estimation is the foundation of all feedback control.

Robot LearningControlSensors
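
The complementary filter mentioned above is simple enough to sketch directly. This is a minimal 1-D tilt estimator, with the blending constant, sensor bias, and noise levels chosen purely for illustration.

```python
import numpy as np

# Sketch: complementary filter fusing a gyro rate (smooth but drifting)
# with an accelerometer tilt angle (noisy but drift-free).
def complementary_filter(gyro_rates, accel_angles, dt=0.01, alpha=0.98):
    angle = accel_angles[0]
    estimates = []
    for w, a in zip(gyro_rates, accel_angles):
        # integrate the gyro, then pull slowly toward the accelerometer angle
        angle = alpha * (angle + w * dt) + (1.0 - alpha) * a
        estimates.append(angle)
    return np.array(estimates)

# Constant 0.5 rad tilt: biased gyro reads 0.05 rad/s, noisy accel is unbiased
rng = np.random.default_rng(0)
n = 2000
est = complementary_filter(0.05 * np.ones(n),
                           0.5 + 0.05 * rng.standard_normal(n))
```

The estimate tracks the true 0.5 rad tilt with heavily attenuated accelerometer noise; the residual offset comes from the gyro bias leaking through the high-pass path.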

Stepper Motor

A brushless DC motor that divides a full rotation into equal discrete steps, allowing open-loop position control without an encoder. Stepper motors are inexpensive and easy to control, making them common in 3D printers, CNC machines, and low-cost robot arms. They can lose steps under high load, so closed-loop variants with encoders are used in more demanding applications.

HardwareActuation

Strain Gauge

A resistive sensor whose electrical resistance changes when mechanically deformed. Strain gauges are bonded to structural elements to measure force, torque, or pressure. They are the sensing element in most force-torque sensors and load cells used in robotics. Wheatstone bridge circuits convert the small resistance changes into a measurable voltage, which is then amplified.

SensorsHardware
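
For a single gauge in a quarter-bridge configuration, the output is commonly approximated as V_out ≈ V_ex · GF · ε / 4. A tiny sketch (the gauge factor and excitation voltage are typical textbook values, not from a specific part):

```python
# Sketch: quarter-bridge output for a bonded strain gauge.
def quarter_bridge_vout(strain, gauge_factor=2.0, v_excitation=5.0):
    """Approximate bridge output: V_out = V_ex * GF * strain / 4."""
    return v_excitation * gauge_factor * strain / 4.0

# 1000 microstrain on a GF=2 gauge with 5 V excitation: about 2.5 mV
v_out = quarter_bridge_vout(1000e-6)
```

The millivolt-scale result shows why an instrumentation amplifier is needed downstream of the bridge.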

Suction Grasping

Grasping objects using vacuum suction cups, which attach to flat or slightly curved surfaces via negative air pressure. Suction grasping is extremely common in logistics and e-commerce fulfillment due to its speed and simplicity — no force closure analysis is needed. It works best on smooth, non-porous surfaces. Dual suction-finger grippers combine suction for flat objects with fingers for irregular ones.

ManipulationGraspingIndustrial

Swarm Robot

A system of many simple, often identical robots that coordinate through local interactions (like swarming insects) to achieve collective behaviors: area coverage, formation control, collective transport. No single robot has global knowledge; emergent intelligence arises from local rules. Applications include environmental monitoring, search-and-rescue, and agricultural inspection.

Mobile Robotics

System Identification

The process of estimating the physical parameters of a real robot or environment (masses, friction coefficients, joint damping, contact stiffness) to improve simulation fidelity. Accurate system identification reduces the sim-to-real gap. Methods include least-squares fitting to recorded trajectories, Bayesian optimization, and neural network-based identification.

SimulationSim-to-Real

Segment Anything

A foundation model for image segmentation by Meta AI that can segment any object in any image given a point, box, or mask prompt. SAM provides class-agnostic segmentation masks with zero-shot generalization. In robotics, SAM enables open-vocabulary object segmentation for grasping and manipulation without task-specific training.

MLVision

Semantic Segmentation

Classifying every pixel in an image into a predefined set of categories (floor, table, object, robot). Semantic segmentation provides dense scene understanding for robot navigation (drivable surface detection) and manipulation (separating objects from background). Architectures include U-Net, DeepLab, and SegFormer.

MLVision

Shadow Hand

A highly dexterous anthropomorphic robot hand by Shadow Robot Company, considered the gold standard for dexterous manipulation research. The Shadow Hand closely mimics human hand kinematics with 24 joints: 20 actuated degrees of freedom plus 4 coupled joints. It has been the platform for landmark dexterous manipulation results, including OpenAI's Rubik's cube solving.

HardwareDexterous

Spot Arm

An optional 6-DOF manipulator attachment for Boston Dynamics Spot, enabling mobile manipulation tasks. The Spot Arm includes a gripper, wrist camera, and semi-autonomous grasping. It extends Spot's capabilities from pure inspection to physical interaction: opening doors, picking up objects, and operating valves. The arm can lift payloads of up to about 11 kg.

HardwareManipulationMobile Robotics

SayCan

A framework by Google that grounds large language model (LLM) reasoning in a robot's physical capabilities. The LLM proposes candidate actions based on a language instruction, and value functions (trained via RL) score how likely the robot is to succeed at each action. SayCan combines LLM task planning with physical grounding to enable long-horizon task execution.

Robot LearningVision-LanguagePlanning

Search and Rescue Robot

A robot deployed in disaster environments (collapsed buildings, floods, wildfires) to locate survivors, deliver supplies, and assess structural damage. SAR robots include ground crawlers (snake robots for rubble navigation), aerial drones (thermal imaging for survivor detection), and marine robots. They must operate in degraded environments with limited communication.

ApplicationsMobile Robotics

Surgical Robot

A robot that assists surgeons in performing minimally invasive procedures with enhanced precision, dexterity, and visualization. The da Vinci system (Intuitive Surgical) is the dominant platform, enabling laparoscopic surgery through small incisions with teleoperated instruments. Emerging systems include autonomous suturing, needle steering, and microsurgery robots.

ApplicationsTeleoperation

Sliding Surface

The manifold in state space on which a sliding mode controller constrains the system dynamics. When the state is on the sliding surface, the system dynamics are governed only by the surface equation, making it insensitive to matched uncertainties. The sliding surface is designed to guarantee convergence to the desired equilibrium when sliding occurs.

Control

Scene Graph

A structured representation of a scene as a graph where nodes are objects or regions and edges are relationships (on top of, next to, holds, etc.). Scene graphs support relational reasoning for robot task planning: to place an object, the robot must understand spatial relationships. VLMs can generate scene graphs from images, enabling language-grounded manipulation.

VisionPlanning

Semantic 3D Reconstruction

Building a 3D map annotated with semantic object labels, combining geometric reconstruction with semantic segmentation. Semantic 3D maps enable task-level navigation (go to the table) and manipulation (pick the mug from the shelf). Methods include SLAM++ (object-level SLAM) and SemanticFusion (real-time 3D semantic reconstruction with CNN labels); libraries such as Open3D-ML provide reusable building blocks.

VisionSLAM3D

Sparse Optical Flow

Tracking a sparse set of salient feature points between consecutive video frames using algorithms like Lucas-Kanade (differential method) or by matching detected features (ORB, SIFT). Sparse optical flow is used in monocular visual odometry, eye-tracking in human-robot interaction, and lightweight motion detection for resource-constrained robots.

VisionSLAM

Stereo Rectification

See Rectification — the process specific to stereo camera systems that transforms both left and right images so corresponding epipolar lines are horizontal and row-aligned, enabling efficient 1D disparity search for depth computation.

VisionCalibration

Structure from Motion

A photogrammetry technique that reconstructs 3D scene structure from a set of overlapping 2D images, estimating the camera poses in the process. SfM pipelines detect features (SIFT/ORB), match features between images, estimate camera poses (essential matrix, PnP), triangulate 3D points, and refine the result via bundle adjustment. COLMAP is the standard offline SfM tool.

Vision3DSLAM

Stance Phase

The portion of a gait cycle when a leg is in contact with the ground, bearing weight and propelling the robot. During stance, the leg transfers ground reaction forces (GRF) from foot to body. Stance-phase controllers must maintain stable contact (staying within the friction cone) while tracking the desired CoM trajectory. SEA-based legs measure stance forces directly from spring deflection.

Locomotion

Static Gait

A locomotion pattern that maintains static stability throughout — the projection of the center of mass stays within the support polygon at all times. A crawl gait, with at least three legs always in contact, is statically stable. Static gaits are slower and more energy-intensive than dynamic gaits but more robust to disturbances. They are appropriate for slow-speed, rough-terrain traversal.

Locomotion
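
The stability condition above reduces to a point-in-polygon test on the ground plane. A minimal sketch for a convex support polygon (the foot positions are illustrative):

```python
import numpy as np

def com_inside_support(com_xy, contact_xy):
    """Static stability check: is the CoM ground projection inside the
    convex support polygon? contact_xy must be ordered counter-clockwise."""
    pts = np.asarray(contact_xy, dtype=float)
    for i in range(len(pts)):
        a, b = pts[i], pts[(i + 1) % len(pts)]
        edge, rel = b - a, np.asarray(com_xy) - a
        # 2D cross product: negative means the CoM is right of this edge
        if edge[0] * rel[1] - edge[1] * rel[0] < 0:
            return False
    return True

# Three quadruped feet in contact (CCW triangle); CoM projected at the origin
feet = [(-0.2, -0.15), (0.25, -0.1), (0.0, 0.3)]
com_inside_support((0.0, 0.0), feet)    # inside: statically stable
com_inside_support((0.5, 0.0), feet)    # outside: would tip over
```

Real controllers usually add a safety margin, requiring the CoM to stay some distance inside the polygon edges rather than merely within them.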

Step Frequency

The number of steps taken per second (or strides per minute) during locomotion. Step frequency and step length together determine speed. Higher step frequencies enable faster speed but increase metabolic cost and actuator demands. Optimal step frequency minimizes energetic cost of transport. RL policies learn speed-appropriate step frequencies rather than following fixed gait specifications.

Locomotion

Stepping Stone Navigation

A locomotion challenge requiring a legged robot to navigate across a field of discrete, precisely placed stepping stones, requiring exact foot placement. Stepping stone problems stress the accuracy of foot placement control and the integration of perception (stone localization) with footstep planning. They are a standard benchmark for perception-based legged locomotion.

LocomotionNavigation

Swing Phase

The portion of a gait cycle when a leg is in the air, moving from its previous contact point to the next. The swing trajectory determines foot clearance and landing position. Swing-phase controllers must coordinate hip, knee, and ankle joints to achieve the target foothold while avoiding the ground. Impedance control in swing prevents damage from unexpected ground contact.

LocomotionControl

Sensor-Based Grasping

Grasping that uses real-time sensor feedback (tactile, force, visual) to adapt the grasp strategy during execution. Unlike open-loop grasping, sensor-based methods detect and correct errors in approach, contact, and closure phases. Tactile slip detection triggers corrective finger adjustments; force feedback prevents crushing fragile objects.

ManipulationSensorsGrasping

Spatial Reasoning

The cognitive capability to understand, reason about, and plan in 3D space. In robotics, spatial reasoning includes: understanding spatial relationships (on top of, inside, beside), planning collision-free trajectories, and predicting the effects of actions on spatial arrangements. VLMs are increasingly used to provide spatial reasoning for robot task planning.

ManipulationPlanningVision-Language

Slip Ring

An electromechanical device that enables continuous electrical connection between a stationary and a rotating component. Slip rings are used in robot wrists and base joints that can rotate continuously without limit, allowing power and signal cables to pass through without twisting. They introduce electrical resistance and noise — wireless alternatives are preferred where possible.

HardwareElectronics

Strain Wave Gear

Another name for harmonic drive — a gear mechanism using a flexible elliptical cup (flex spline), a rigid circular spline, and an elliptical wave generator to achieve high reduction ratios with near-zero backlash. The flex spline deflects elastically as the wave generator rotates, engaging successive teeth for smooth, high-precision motion.

HardwareActuation

SPI Protocol

Serial Peripheral Interface — a synchronous serial communication protocol using four lines (MOSI, MISO, SCLK, CS) with full-duplex data transfer. SPI supports high clock speeds (typically tens of MHz) and is used for high-bandwidth sensor interfaces: IMUs (MPU-6000), ADCs, and display drivers in robot systems. Unlike I2C, which addresses devices on a shared two-wire bus, SPI selects each peripheral with a dedicated chip-select line.

HardwareElectronicsSoftware

SE3

The Special Euclidean group in 3D — the Lie group of all rigid body transformations (rotations and translations) in 3D space. SE(3) is the fundamental mathematical object for describing robot poses, sensor-to-robot transforms, and object poses. Elements of SE(3) are represented as 4×4 homogeneous matrices or as (quaternion, translation) pairs.

MathKinematics
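
In the homogeneous-matrix representation, pose composition is just matrix multiplication and inversion has a cheap closed form. A minimal sketch (the pose values are illustrative):

```python
import numpy as np

# SE(3) elements as 4x4 homogeneous matrices T = [[R, t], [0, 1]].
def make_T(R, t):
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_T(T):
    """Closed-form inverse T^-1 = [[R^T, -R^T t], [0, 1]];
    no general matrix inversion needed."""
    R, t = T[:3, :3], T[:3, 3]
    return make_T(R.T, -R.T @ t)

# Example camera-in-base pose: 90-degree rotation about z plus a translation
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
T_base_cam = make_T(Rz, np.array([1.0, 0.0, 0.5]))
T_id = T_base_cam @ invert_T(T_base_cam)   # identity, up to floating point
```

Chaining transforms this way (e.g., world-to-base times base-to-camera) is exactly what libraries like ROS TF do under the hood.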

SO3

The Special Orthogonal group in 3D — the Lie group of all 3D rotations, representing the set of 3×3 orthogonal matrices with determinant +1. SO(3) is the configuration space of a spherical joint. Its Lie algebra so(3) consists of 3×3 skew-symmetric matrices, enabling local parameterization via the exponential map (Rodrigues formula).

MathKinematics
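
The exponential map via the Rodrigues formula is compact enough to sketch directly (the rotation vector in the example is illustrative):

```python
import numpy as np

def so3_exp(omega):
    """Exponential map so(3) -> SO(3) via the Rodrigues formula:
    R = I + (sin t / t) K + ((1 - cos t) / t^2) K^2, t = ||omega||,
    where K is the skew-symmetric matrix of omega."""
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)               # small-angle limit
    K = np.array([[0.0, -omega[2], omega[1]],
                  [omega[2], 0.0, -omega[0]],
                  [-omega[1], omega[0], 0.0]])
    return (np.eye(3) + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

R = so3_exp(np.array([0.0, 0.0, np.pi / 2]))   # quarter turn about z
```

The resulting matrix rotates the x-axis onto the y-axis, as expected for a 90-degree z-rotation.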

Spline Interpolation

Interpolating a smooth curve through a set of data points using piecewise polynomials (splines) that are continuous and have continuous derivatives at join points (knots). Cubic splines guarantee C2 continuity. In robot trajectory generation, splines interpolate between via-points to create smooth, differentiable joint trajectories within velocity and acceleration limits.

MathPlanning
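
Production trajectory generators typically use library spline routines plus velocity and acceleration limit checks; as a minimal self-contained sketch, here is C1-continuous cubic Hermite interpolation through via-points with finite-difference (Catmull-Rom style) tangents. The via-points are illustrative joint angles.

```python
import numpy as np

def hermite_segment(q0, q1, v0, v1, s):
    """Cubic Hermite basis on s in [0, 1]; C1-continuous at the knots."""
    h00 = 2*s**3 - 3*s**2 + 1
    h10 = s**3 - 2*s**2 + s
    h01 = -2*s**3 + 3*s**2
    h11 = s**3 - s**2
    return h00*q0 + h10*v0 + h01*q1 + h11*v1

def interpolate_waypoints(q, samples_per_seg=10):
    """Interpolate via-points with tangents from central finite differences."""
    q = np.asarray(q, dtype=float)
    v = np.gradient(q)                       # one tangent per waypoint
    out = []
    for i in range(len(q) - 1):
        for s in np.linspace(0, 1, samples_per_seg, endpoint=False):
            out.append(hermite_segment(q[i], q[i+1], v[i], v[i+1], s))
    out.append(q[-1])
    return np.array(out)

traj = interpolate_waypoints([0.0, 0.5, 0.4, 1.0])   # joint-angle via-points
```

The curve passes exactly through every via-point while keeping velocity continuous across segment boundaries, which is the property that matters for smooth joint execution.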

Stochastic Gradient Descent

An iterative optimization algorithm that updates model parameters using gradients computed on random mini-batches of data. SGD is the foundation of deep learning training. Momentum, adaptive learning rate (Adam, RMSProp), and learning rate scheduling variants improve convergence. In robot learning, mini-batch gradient descent on demonstration datasets trains imitation learning policies.

MathML
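
The core update loop is tiny; here is a minimal mini-batch SGD sketch fitting a linear model on synthetic data (the learning rate, batch layout, and data are illustrative):

```python
import numpy as np

# Mini-batch SGD fitting y = w*x + b by gradient steps on the MSE loss.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X - 0.5 + 0.01 * rng.standard_normal(200)   # true w=3.0, b=-0.5

w, b, lr = 0.0, 0.0, 0.1
for epoch in range(200):
    idx = rng.permutation(len(X))                 # reshuffle each epoch
    for batch in np.array_split(idx, 10):         # mini-batches of 20
        err = (w * X[batch] + b) - y[batch]       # prediction error
        w -= lr * np.mean(err * X[batch])         # gradient of MSE/2 w.r.t. w
        b -= lr * np.mean(err)                    # gradient of MSE/2 w.r.t. b
```

After training, `w` and `b` recover the generating parameters; Adam and momentum variants modify only how the per-batch gradient is turned into an update.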

Support Vector Machine

A supervised learning algorithm that finds the maximum-margin hyperplane separating two classes in feature space. SVMs with kernel tricks (RBF, polynomial) handle nonlinear classification. In robotics, SVMs are used for contact/no-contact classification from tactile signals and fault detection from vibration signatures. They have largely been superseded by neural networks for complex tasks.

MathML

Symplectic Integration

A class of numerical integration methods that preserve the symplectic structure of Hamiltonian systems (conservation of energy over long simulations). Standard Runge-Kutta methods can accumulate energy drift in robot dynamics simulation; symplectic integrators (Störmer-Verlet, leapfrog) maintain bounded energy errors. Used in physics-based animation and long-horizon robot simulation.

MathSimulation
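
The energy behavior is easy to demonstrate on a unit harmonic oscillator; a minimal sketch comparing explicit Euler against leapfrog (the step size and horizon are illustrative):

```python
# Leapfrog (Stoermer-Verlet) vs explicit Euler on x'' = -x.
# Energy E = (v^2 + x^2)/2 stays bounded for leapfrog but grows under Euler.
def simulate(method, dt=0.05, steps=20000):
    x, v = 1.0, 0.0
    for _ in range(steps):
        if method == "euler":
            x, v = x + dt * v, v - dt * x
        else:                        # leapfrog: half-kick, drift, half-kick
            v -= 0.5 * dt * x
            x += dt * v
            v -= 0.5 * dt * x
    return 0.5 * (v * v + x * x)

e_euler = simulate("euler")          # drifts far above the initial 0.5
e_leap = simulate("leapfrog")        # stays near 0.5 for the whole run
```

Explicit Euler multiplies the energy by roughly (1 + dt^2) every step, while the leapfrog energy merely oscillates within a bounded band, which is why symplectic schemes are preferred for long-horizon dynamics simulation.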

Smart Factory

A highly digitized manufacturing facility where machines, robots, sensors, and systems are interconnected via IoT and communicate in real time to optimize production. Smart factories integrate digital twins, AI-based quality control, autonomous mobile robots, and predictive maintenance into a self-optimizing production system — the industrial embodiment of Industry 4.0.

ApplicationsIndustrial

Space Robot

A robot designed for operation in space: satellite servicing, planetary exploration, orbital assembly, and astronaut assistance. Space robots must withstand extreme temperatures, radiation, and vacuum, often with long communication delays (up to about 44 minutes round-trip for Mars). Key examples include the Mars rovers (Curiosity, Perseverance), Canadarm2, and JAXA's free-flying Int-Ball camera robot.

ApplicationsHardware

Spray Coating Robot

An industrial robot arm equipped with a spray gun for applying coatings (paint, adhesive, sealant, thermal spray) to surfaces. Spray robots optimize coverage uniformity, material consumption, and cycle time. They are standard in automotive painting lines, aerospace primer coating, and consumer electronics lacquering. Path planning optimizes gun speed and distance for uniform film thickness.

ApplicationsIndustrial

Skill Chaining

Learning to compose a sequence of skills (learned sub-policies) to achieve long-horizon goals. Skill chaining requires: skill discovery (identifying natural sub-task boundaries), initiation sets (conditions under which each skill can be started), and termination conditions. Hierarchical RL and option frameworks formalize skill chaining.

Robot LearningPlanning

Skill Library

A repository of reusable, parameterized robot skill primitives (reach, grasp, pour, insert, open) that can be composed by a task planner to achieve complex goals. Skills encapsulate low-level control and can be parameterized (e.g., 'grasp at this pose' or 'pour until full'). Skill libraries grow incrementally as the robot learns new capabilities.

Robot LearningPlanning

Sparse Reward

A reward function that provides signal only at task completion (success/failure), with zero reward at all intermediate steps. Sparse rewards are the most natural specification (binary success) but the hardest for RL to learn from — the agent must discover successful behavior through random exploration. HER, curriculum learning, and reward shaping address sparse reward learning.

Robot LearningRL

State Abstraction

Compressing the full robot and environment state into a compact representation that preserves information relevant to the task while discarding irrelevant details. State abstraction reduces the dimensionality of the learning problem and improves generalization. Learned state abstractions (from autoencoders or contrastive methods) are preferred over hand-crafted ones.

Robot LearningRepresentation Learning

Stochastic Policy

A policy that outputs a probability distribution over actions (rather than a single deterministic action), enabling exploration and handling of aleatoric uncertainty. Gaussian policies (mean + diagonal covariance) are standard in policy gradient RL. Diffusion policies and normalizing flow policies extend to more complex action distributions. Stochastic policies are necessary for maximum entropy RL (SAC).

Robot LearningRLPolicy
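
A minimal sketch of the standard diagonal-Gaussian policy head: sample an action and compute its log-probability, the quantity that policy-gradient methods differentiate. The mean, log-std, and function name are illustrative.

```python
import numpy as np

def gaussian_policy_sample(mean, log_std, rng):
    """Sample from a diagonal Gaussian policy and return (action, log_prob)."""
    std = np.exp(log_std)
    action = mean + std * rng.standard_normal(mean.shape)
    # log N(a | mean, diag(std^2)) summed over action dimensions
    log_prob = np.sum(
        -0.5 * ((action - mean) / std) ** 2 - log_std - 0.5 * np.log(2 * np.pi)
    )
    return action, log_prob

rng = np.random.default_rng(0)
mean = np.array([0.1, -0.3])         # e.g. a 2-DOF velocity command
log_std = np.array([-1.0, -1.0])     # std about 0.37 in each dimension
action, logp = gaussian_policy_sample(mean, log_std, rng)
```

Shrinking `log_std` toward large negative values collapses the policy onto its deterministic mean, which is how exploration is annealed away at deployment time.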

Structured Prediction

Predicting structured outputs (sequences, graphs, configurations) where output elements are interdependent and must satisfy global consistency constraints. In robot learning, structured prediction applies to: action sequence generation (temporal consistency), scene graph prediction (relational consistency), and grasp set prediction (pose diversity and feasibility).

Robot LearningML

T

Task-parameterized Learning

Task-parameterized learning encodes demonstrations relative to multiple coordinate frames or task parameters (e.g., the object's pose, a target location, an obstacle frame) rather than in a fixed world frame. When executing, the policy adapts automatically to new object and target configurations without retraining, because it has learned motion relative to task-relevant references. Task-parameterized Gaussian Mixture Models (TP-GMM) and kernelized movement primitives are classical implementations. This approach provides strong geometric generalization for structured pick-and-place tasks, though it requires task frames to be identified and tracked at runtime.

Imitation LearningGeneralizationPolicy

Teleoperation

Teleoperation is the remote control of a robot by a human operator, used both for direct task execution (surgical robots, space robotics, bomb disposal) and as the primary method for collecting high-quality imitation learning demonstrations. In robot learning, a common setup uses a leader-follower architecture: the operator moves a lightweight leader arm and the robot (follower) tracks the leader in real time. VR-based teleoperation systems (using hand tracking or controllers) are increasingly popular as they are more ergonomic and allow higher data throughput. SVRC provides professional teleoperation data collection services for enterprise robot learning teams.

Data CollectionImitation LearningHardware

Trajectory

A trajectory is a time-parameterized sequence of robot states (joint angles or Cartesian poses) that describes how the robot moves from a start configuration to a goal. Trajectories can be generated by motion planners (planning a collision-free path then time-parameterizing it for smooth execution), by teleoperation recording (capturing the operator's motion at a fixed frequency), or predicted directly by a neural policy. Trajectory smoothness and velocity continuity are important for physical robot safety — abrupt discontinuities cause mechanical stress and can trigger safety stops. Trajectory representations include splines, dynamic movement primitives (DMPs), and discrete waypoint sequences.

PlanningControlData

Transfer Learning

Transfer learning in robotics involves taking a model pretrained on one domain (e.g., internet vision-language data, simulation, or a different robot) and adapting it to a target task or robot with limited additional data. Fine-tuning the final layers of a pretrained backbone on robot demonstration data is the most common approach; full fine-tuning of all weights is used when sufficient robot data is available. Transfer learning is the mechanism that makes foundation models practical for robotics — the alternative of training from scratch on robot data alone would require millions of demonstrations. See also pre-training, sim-to-real transfer.

Foundation ModelTraining

Tactile Sensor

A sensor that measures contact forces, pressure distributions, or surface textures at the point of physical interaction — typically a robot fingertip or gripper pad. Technologies include resistive (piezoresistive arrays), capacitive, optical (GelSight, DIGIT), and barometric (MEMS pressure-sensor arrays). Tactile sensing enables slip detection, in-hand manipulation, and object recognition by touch.

SensorsHardwareManipulation

Task and Motion Planning

An integrated planning framework that combines symbolic task planning (deciding what actions to take and in what order) with geometric motion planning (finding collision-free paths for each action). TAMP solvers alternate between logical search and geometric feasibility checking. They enable long-horizon manipulation by reasoning about both abstract task structure and physical constraints.

ManipulationPlanning

Teach Pendant

A handheld device used to program and control an industrial robot, providing buttons for jogging (manual motion), recording waypoints, and editing robot programs. Teach pendants are the traditional robot programming interface. Modern pendants feature touchscreens and graphical programming. They are being complemented by tablet apps, AR interfaces, and direct kinesthetic teaching.

HardwareIndustrial

TEB Planner

Timed Elastic Band — a local planning algorithm that optimizes a trajectory (sequence of timed poses) to minimize travel time while satisfying kinematic constraints, obstacle clearance, and dynamic feasibility. TEB naturally handles non-holonomic robots (cars, differential drives) and produces smooth, time-optimal trajectories. It is available as a ROS navigation plugin.

NavigationPlanning

Teleoperation System

A system that enables a human operator to remotely control a robot, typically consisting of a master device (input), communication link, and slave robot (output). Teleoperation systems range from simple joystick control to bilateral force-feedback systems that give the operator a sense of touch. Applications include surgery, bomb disposal, space exploration, and data collection for robot learning.

TeleoperationHardware

Temporal Ensemble

An inference-time technique used with action-chunking policies (such as ACT) in which overlapping action predictions from consecutive timesteps are combined to produce smoother, more consistent motion. If the policy predicts a 50-step action chunk at every timestep, each executed action has up to 50 overlapping predictions from previous chunks; these are averaged with exponentially decaying weights. This reduces jitter and improves execution quality.

Robot LearningPolicy
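
A minimal sketch of the ensembling step, following ACT's weighting scheme w_i = exp(-m * i) with the oldest still-active prediction weighted most heavily. The decay rate `m`, the function name, and the toy chunks are illustrative.

```python
import numpy as np

def ensembled_action(chunk_history, t, m=0.1):
    """chunk_history[s] = chunk predicted at step s, shape (horizon, act_dim).
    Blend every still-active chunk's prediction for timestep t."""
    actions, weights = [], []
    j = 0
    for s in sorted(chunk_history):        # oldest chunk first
        chunk = chunk_history[s]
        i = t - s                          # offset into that chunk
        if 0 <= i < len(chunk):
            actions.append(chunk[i])
            weights.append(np.exp(-m * j)) # w0 = 1 for the oldest prediction
            j += 1
    w = np.array(weights) / np.sum(weights)
    return w @ np.array(actions)

# Two overlapping 4-step chunks for a 1-DOF action
history = {0: np.array([[0.0], [1.0], [2.0], [3.0]]),
           1: np.array([[1.2], [2.2], [3.2], [4.2]])}
a = ensembled_action(history, t=2)   # blends chunk0[2]=2.0 and chunk1[1]=2.2
```

Because each executed action averages several independent predictions, single-step prediction noise is smoothed out without any extra policy queries.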

Terrain Generation

Creating varied ground surfaces (slopes, stairs, gaps, rough terrain, deformable soil) in simulation for training legged robot locomotion policies. Terrain generators procedurally create difficulty-graded environments that teach the robot to handle diverse real-world ground conditions. Terrain randomization is essential for robust quadruped and humanoid locomotion.

SimulationLocomotion

TF (Transforms)

The ROS transform library that maintains the relationship between coordinate frames in a robot system as a tree of time-stamped transforms. TF enables any node to transform data between frames (e.g., camera frame to base frame to world frame). Proper TF tree setup is essential for every ROS robot — incorrect transforms cause perception and control failures.

SoftwareKinematics

Thermal Camera

An infrared imaging sensor that captures heat radiation (typically 8–14 μm wavelength) to produce a thermal image showing temperature distributions. In robotics, thermal cameras are used for predictive maintenance (detecting overheating components), search-and-rescue (finding people in smoke), and food handling (verifying cooking temperature). They work in complete darkness.

SensorsHardware

Time-of-Flight Sensor

A depth sensor that measures distance by emitting modulated light and measuring the phase shift or round-trip time of the reflected signal. ToF sensors provide direct depth measurements at each pixel, unlike stereo cameras that require correspondence matching. They are used in short-range depth cameras (PMD, Azure Kinect) and single-point ranging modules (VL53L series).

SensorsVisionHardware

Tool Changer

A mechanical interface that allows a robot to automatically swap end-effectors (grippers, tools, sensors) without human intervention. Tool changers typically use pneumatic, mechanical, or electromagnetic locking and pass-through connections for pneumatics, electricity, and communication. They enable a single robot to handle multiple tasks requiring different tools.

Hardware

Tool Use

A manipulation skill where the robot uses a tool (hammer, screwdriver, spatula, brush) to act on the environment, extending its functional capabilities beyond what its bare end-effector can achieve. Tool use requires understanding tool affordances, maintaining a stable grasp during force application, and adapting control to the tool's dynamics.

Manipulation

Torque Sensor

A sensor that measures rotational force (torque) applied to a shaft or joint. In collaborative robots, joint torque sensors enable collision detection, gravity compensation, and compliant control without external force-torque sensors. Strain-gauge-based torque sensors are integrated into each joint of robots like the Franka Emika Panda and KUKA iiwa.

SensorsHardwareControl

Trajectory Optimization

Computing an optimal trajectory (sequence of states and controls) that minimizes a cost function subject to dynamics constraints. Methods include shooting (optimize controls only), collocation (jointly optimize states and controls), and differential dynamic programming (DDP/iLQR). Trajectory optimization is used for motion planning, model predictive control, and generating expert demonstrations for imitation learning.

Robot LearningControlPlanning

Transporter Networks

A vision-based manipulation architecture that learns pick-and-place policies by predicting dense pixel-wise correspondences between source (pick) and target (place) locations. Transporter Networks use spatial attention to detect where to pick and cross-correlation to determine where to place. They are highly sample-efficient and effective for precise rearrangement tasks.

ManipulationRobot LearningVision

Tokenization

Converting continuous or structured data into discrete tokens for processing by transformer models. In VLAs, tokenization applies to: text (BPE tokenization into subword tokens), images (patch tokenization into visual tokens), and actions (discretization into action bins or clustering into action tokens). The tokenization scheme significantly impacts model capacity and inference speed.

MLTransformer
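
Action tokenization by uniform binning is simple to sketch. The 256-bin choice follows common practice in RT-style models; the action range and function names here are illustrative.

```python
import numpy as np

def tokenize_actions(actions, low=-1.0, high=1.0, n_bins=256):
    """Map continuous actions in [low, high] to integer tokens in [0, n_bins-1]."""
    a = np.clip(actions, low, high)
    return ((a - low) / (high - low) * (n_bins - 1)).round().astype(int)

def detokenize(tokens, low=-1.0, high=1.0, n_bins=256):
    """Map tokens back to bin-center continuous values."""
    return low + tokens / (n_bins - 1) * (high - low)

tok = tokenize_actions(np.array([-1.0, 0.0, 0.37, 1.0]))
rec = detokenize(tok)    # round-trip error bounded by half a bin width
```

The bin count trades off action resolution against vocabulary size: finer bins reduce quantization error but give the transformer a larger output space per action dimension.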

Transformer Architecture

A neural network architecture based on self-attention mechanisms, originally designed for NLP but now dominant across modalities. Transformers process sequences of tokens in parallel, computing attention between all pairs. In robotics, transformers are used for: policy networks (decision transformer, ACT), visual encoders (ViT), and VLA models that unify vision, language, and action.

MLArchitectureTransformer

TD3

Twin Delayed DDPG — an off-policy RL algorithm that addresses DDPG's overestimation bias by: (1) using two Q-networks and taking the minimum, (2) delaying policy updates relative to Q-network updates, and (3) adding smoothed target policy noise. TD3 is more stable than DDPG and is a standard baseline for continuous-control robot learning tasks.

RL

TRPO

Trust Region Policy Optimization — a policy gradient RL algorithm that constrains each policy update to a trust region (measured by KL divergence) to prevent large, destructive updates. TRPO provides monotonic improvement guarantees but is computationally expensive due to the constrained optimization. PPO approximates TRPO's benefits with a simpler clipped surrogate objective.

RL

Terminal Sliding Mode

A sliding mode control variant that uses a nonlinear sliding surface with fractional power terms, guaranteeing finite-time (rather than asymptotic) convergence to the equilibrium. Terminal sliding mode ensures the tracking error reaches exactly zero in finite time — desirable for high-precision robot positioning. The nonlinear surface avoids the infinite settling time of linear sliding surfaces.

Control
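
As a sketch, one common form of the nonlinear surface for a scalar tracking error e is:

```latex
s = \dot{e} + k\,\operatorname{sgn}(e)\,|e|^{q/p},
\qquad k > 0,\quad p > q > 0 \text{ odd integers}.
```

On the surface (s = 0), the error obeys \(\dot{e} = -k\,\operatorname{sgn}(e)\,|e|^{q/p}\), which reaches e = 0 in the finite time \(t_f = \dfrac{p\,|e(0)|^{(p-q)/p}}{k\,(p-q)}\), unlike the exponential (asymptotic) decay produced by a linear surface.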

Transformer Visual Backbone

Using a Vision Transformer (ViT) as the feature extraction backbone in a computer vision pipeline. Compared to CNN backbones, ViT backbones model long-range dependencies between image patches via self-attention, providing superior performance on large-scale datasets. They are the default backbone in modern detection (DINO-DETR), segmentation (Mask2Former), and VLA models.

VisionML

Thermal Coefficient of Expansion

The fractional change in a material's dimensions per degree of temperature change (also called the coefficient of thermal expansion, CTE). TCE mismatch between joined materials causes stress and dimensional errors in precision robot structures and sensors. High-precision robots either use low-expansion materials such as Invar for critical structural members or pair materials with closely matched TCE to maintain calibration across temperature ranges.

HardwareMaterials

Titanium

A lightweight, high-strength metal used in robot structures where low weight, high strength, and corrosion resistance are required. Titanium's specific strength is superior to steel, making it ideal for robot arms (reducing inertia), surgical robots (biocompatibility), and aerospace robots. It is more expensive and harder to machine than aluminum or steel.

HardwareMaterials

Torque Ripple

Periodic fluctuation in a motor's output torque as a function of rotor position, caused by cogging (interaction between rotor magnets and stator slots) and commutation imperfections. Torque ripple causes vibration and position oscillations, degrading control quality for smooth motion tasks. Mitigation techniques include skewed stator slots, multi-pole designs, and torque ripple compensation in the controller.

HardwareActuationControl

Telemetry

Remote monitoring of robot state data (joint positions, velocities, temperatures, power consumption, error codes) transmitted wirelessly or via network to an operator's console. Telemetry enables fleet operators to monitor many deployed robots simultaneously, detect anomalies early, and collect operational data for system improvement and predictive maintenance.

SoftwareDeployment

Time Synchronization

Aligning the clocks of distributed robot system components (sensors, controllers, actuators, planning computers) to a common time reference. Precise time synchronization is essential for data fusion (aligning sensor timestamps) and control (coordinating distributed servo loops). PTP (Precision Time Protocol, IEEE 1588) achieves sub-microsecond synchronization over Ethernet.

SoftwareControlSensors
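
Even with synchronized clocks, sensor streams arrive at different rates, so a common software step is resampling one stream onto another's timestamps. A minimal numpy sketch:

```python
import numpy as np

def align_to_timestamps(t_target, t_source, values):
    """Linearly resample a 1-D sensor signal onto target timestamps.

    t_target : timestamps to align to (e.g., camera frames), in seconds
    t_source : timestamps of the source signal (e.g., an IMU channel)
    values   : source samples taken at t_source
    """
    return np.interp(t_target, t_source, values)
```

For example, aligning a 100 Hz IMU channel to 30 Hz camera frames amounts to one call per channel; multi-dimensional signals are resampled per component.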

Tensor Decomposition

Factorizing a multi-dimensional array (tensor) into simpler components (CP, Tucker, or tensor train decomposition). In robotics, tensor decomposition is used for: compressing large neural network weight tensors for edge deployment, analyzing multi-modal sensor data, and efficient computation of robot dynamics quantities.

MathML
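
For the two-dimensional (matrix) special case, low-rank compression reduces to a truncated SVD; a minimal numpy sketch of compressing a weight matrix (names are illustrative):

```python
import numpy as np

def low_rank_compress(W, rank):
    """Rank-r factorization of a weight matrix via truncated SVD.

    Replaces an (m, n) layer with factors of shapes (m, r) and (r, n),
    cutting parameter count from m*n to r*(m + n) when r is small.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # (m, r), singular values folded in
    B = Vt[:rank, :]             # (r, n)
    return A, B
```

CP, Tucker, and tensor-train decompositions generalize this idea to tensors with three or more modes.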

Total Variation

A measure of signal variation that quantifies the sum of absolute differences between adjacent values. Total variation regularization is used for image denoising (preserving edges), trajectory smoothing (penalizing abrupt sample-to-sample changes), and sparse signal recovery. TV-L1 optical flow algorithms use TV regularization to produce sharp, piecewise-smooth motion fields.

MathVision
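
The 1-D definition is a one-liner; a minimal numpy sketch:

```python
import numpy as np

def total_variation(x):
    """Total variation of a 1-D signal: sum of absolute differences
    between adjacent samples. Low TV means a piecewise-smooth signal."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.diff(x)).sum()
```

A step signal like [0, 0, 1, 1] has TV = 1, while an oscillating signal of the same range has much higher TV — which is why penalizing TV smooths noise but preserves edges.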

Tire Handling Robot

A specialized industrial robot for handling, mounting, and balancing automotive tires. Tire handling robots manage heavy, deformable objects in high-volume automotive manufacturing. They must adapt to various tire sizes and manage the viscoelastic behavior of rubber during mounting. Integrated force-torque sensing prevents bead damage during rim mounting.

ApplicationsIndustrial

Task Specification

The method used to communicate what task a robot should perform. Specifications include: reward functions (RL), demonstrations (imitation learning), goal images, natural language instructions (VLA), and formal task descriptions (PDDL). The choice of task specification determines what the robot can generalize to and how easily new tasks can be specified.

Robot LearningPlanning

Temporal Abstraction

Representing and reasoning at multiple timescales — enabling long-horizon planning by abstracting away low-level details. The options framework (semi-MDPs with sub-policies and termination conditions) is the formal framework for temporal abstraction in RL. Temporal abstraction enables efficient credit assignment across many timesteps in long-horizon manipulation tasks.

Robot LearningRLPlanning

Tool-Augmented Policy

A robot policy that can select and use tools (screwdriver, spatula, brush) from a toolkit to accomplish tasks beyond its bare-gripper capability. Tool-augmented policies require understanding tool affordances (what each tool does), tool grasping, and switching between tools mid-task. LLMs and VLMs are used for tool selection based on task requirements.

Robot LearningManipulation

Trajectory Dataset

A dataset of robot trajectories (sequences of states, actions, and optionally rewards) used to train imitation learning or offline RL policies. The quality, diversity, and size of trajectory datasets are the primary determinants of learned policy performance. Large-scale trajectory datasets (Open X-Embodiment, Bridge Data) have enabled generalist manipulation policies.

Robot LearningData

U

URDF (Unified Robot Description Format)

URDF is an XML-based file format that describes a robot's kinematic and dynamic properties: links (rigid bodies with mass, inertia, and visual/collision meshes) and joints (the connections between links, with type, axis, limits, and damping parameters). URDF is the standard robot description format in ROS and is supported by all major simulation platforms (Isaac Sim, MuJoCo, Gazebo, PyBullet). It enables loading the robot's kinematics into motion planners like MoveIt, visualizing the robot in RViz, and instantiating physics simulation models. XACRO (XML macro language) is commonly used to parameterize and modularize URDF files for complex robots. OpenArm and most SVRC hardware have publicly available URDF models.

ToolStandardSimulation

Underactuated Gripper

A gripper with fewer actuators than degrees of freedom, using mechanical linkages, springs, or differential mechanisms to distribute motion across joints. Underactuation enables passive adaptation to object shape without complex control. The Robotiq 2F-85 and 3-Finger grippers are notable underactuated designs widely used in research and industry.

HardwareGrasping

Underwater Robot

A robot designed for subaquatic operation: ROVs (remotely operated vehicles), AUVs (autonomous underwater vehicles), and underwater manipulators. Underwater robots face unique challenges: water pressure, corrosion, limited communication (no RF, acoustic only), and 6-DOF dynamics including buoyancy. Applications include ocean exploration, pipeline inspection, and marine biology.

HardwareMobile Robotics

U-Net

An encoder-decoder CNN with skip connections between corresponding encoder and decoder layers, originally designed for medical image segmentation. The skip connections preserve fine spatial details lost during downsampling. U-Net architectures are used in robot perception for dense prediction tasks: depth estimation, segmentation, and affordance prediction.

MLVisionArchitecture

Unitree G1

A compact humanoid robot by Unitree Robotics, standing 127cm and weighing 35kg. The G1 features 23 DOF with joint motors capable of high-speed motion. It is designed as an affordable research and commercial humanoid platform. The G1 supports dexterous hands as optional accessories and targets service, education, and research applications.

HardwareHumanoid

Unitree Go2

A consumer-grade quadruped robot by Unitree, featuring 12 DOF, built-in LiDAR, cameras, and an AI computing module. The Go2 weighs 15kg and runs for 1-2 hours. It supports RL-trained locomotion, autonomous navigation, and a developer SDK. The Go2 is popular in academic research as an affordable quadruped platform.

HardwareLocomotion

Unitree H1

Unitree's full-size humanoid robot standing 180cm and weighing 47kg, with 19 DOF. The H1 targets industrial and commercial applications requiring human-form-factor robots. It features joint motors with high torque density and supports various dexterous hand attachments. H1 has been used to demonstrate RL-trained humanoid locomotion and manipulation.

HardwareHumanoidLocomotion

Universal Robots UR5

A collaborative 6-DOF robot arm by Universal Robots with 5kg payload and 850mm reach. The UR5 is the best-selling collaborative robot worldwide, known for its ease of programming (Polyscope teach pendant), built-in force sensing, and affordable price point. It is widely used in manufacturing (machine tending, assembly, packaging) and research.

HardwareIndustrialManipulation

Underwater Inspection Robot

An ROV or AUV designed to inspect underwater infrastructure: subsea pipelines, offshore platforms, ship hulls, and dam walls. These robots carry cameras, sonar, and corrosion sensors. They operate at depths and durations impossible for human divers. Autonomous inspection (planned survey paths with anomaly detection) is an active development area.

ApplicationsMobile Robotics

Unstructured Environment

An environment where object locations, types, and configurations are not known in advance and change unpredictably. Robots in unstructured environments must robustly perceive, plan, and act without relying on fixtures, part feeders, or controlled workspace conditions. Most real-world robot applications are unstructured, in contrast to fixed industrial cells.

ManipulationNavigation

UART

Universal Asynchronous Receiver-Transmitter — a serial communication protocol that transmits data asynchronously byte-by-byte at a pre-configured baud rate agreed by both endpoints. UART is the simplest serial interface and is ubiquitous in robotics microcontrollers for: GPS receivers, radio modems, motor controllers, and debug consoles. Common baud rates: 9600 to 921600 bps.

HardwareElectronicsSoftware

UV Disinfection Robot

An autonomous mobile robot that uses ultraviolet-C (UV-C) light to disinfect surfaces in hospitals, airports, and public spaces. UV disinfection robots navigate autonomously, stop near target surfaces, and activate UV lamps to inactivate pathogens. They supplement but do not replace manual cleaning. Safety systems prevent UV exposure to humans during operation.

ApplicationsMedicalMobile Robotics

Uncertainty Quantification

Estimating the confidence of a robot system's predictions or decisions — distinguishing epistemic uncertainty (due to limited data) from aleatoric uncertainty (due to irreducible randomness). Bayesian neural networks, Monte Carlo dropout, deep ensembles, and conformal prediction quantify uncertainty. Uncertainty-aware robots can request human help when confidence is low.

Robot LearningSafety
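
Deep ensembles estimate epistemic uncertainty from the disagreement between ensemble members; a minimal numpy sketch:

```python
import numpy as np

def ensemble_uncertainty(predictions):
    """Predictive mean and variance from an ensemble of models.

    predictions : array of shape (n_models, ...) — one prediction per
                  ensemble member for the same input.

    The variance across members approximates epistemic uncertainty:
    it shrinks where members agree (well-covered inputs) and grows
    where they disagree (out-of-distribution inputs).
    """
    predictions = np.asarray(predictions, dtype=float)
    return predictions.mean(axis=0), predictions.var(axis=0)
```

A robot could, for instance, request human help whenever the variance exceeds a calibrated threshold.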

V

VLA (Vision-Language-Action Model)

A Vision-Language-Action model is a neural network that jointly processes visual observations (RGB images), natural language instructions, and robot proprioception to produce action outputs. VLAs extend large vision-language models (VLMs such as PaLM-E, LLaVA, or Gemini) by adding an action head — training the model to output robot joint positions or end-effector deltas alongside its language predictions. Notable VLAs include RT-2 (tokenizes actions as text tokens and fine-tunes a VLM), OpenVLA (open-source, 7B parameter, trained on Open X-Embodiment), and π0 (Physical Intelligence's flow-matching VLA). See the VLA and VLM article and the SVRC model catalog.

Foundation ModelLanguageCore Concept

ViperX

ViperX is a series of 6-DOF robot arms manufactured by Trossen Robotics, widely used in academic robot learning research due to their low cost, ROS support, and compatibility with the DYNAMIXEL servo ecosystem. The ViperX 300 (750 mm reach) and ViperX 300 S are among the most common research arms found in imitation learning setups and are the follower arms in the original ALOHA system. ViperX arms have modest payload (~750 g) and accuracy compared to industrial robots but offer an accessible entry point for manipulation research. Browse SVRC's hardware store for availability.

HardwareResearch Robot

Visual Servoing

Visual servoing uses camera feedback in a closed-loop controller to guide a robot toward a goal defined in image space (Image-Based Visual Servoing, IBVS) or 3D space estimated from images (Position-Based Visual Servoing, PBVS). In IBVS, the controller minimizes the error between detected image features (keypoints, object bounding boxes) and their desired positions in the image plane, without explicitly computing 3D poses. Visual servoing is attractive because it directly compensates for calibration errors and camera-robot misalignment. Modern deep learning variants train neural networks to output servoing velocity commands directly from raw images, enabling robust alignment to novel objects.

ControlPerceptionClosed-loop

Vacuum Gripper

An end-effector that uses negative air pressure to attach to object surfaces via suction cups or foam pads. Vacuum grippers excel at picking flat, smooth objects (boxes, sheets, panels) and are the fastest grasping method in logistics. They require a vacuum generator (venturi or pump) and work best on non-porous surfaces.

HardwareGraspingIndustrial

Value Function

A function that estimates the expected cumulative reward from a given state (state-value function V) or state-action pair (action-value function Q). Value functions are central to RL — they guide action selection and policy improvement. Deep RL uses neural networks to approximate value functions. In robotics, learned value functions can serve as cost-to-go estimates for planning and as safety critics.

Robot LearningRL
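
The recursion behind value functions is easiest to see in the tabular case; a minimal value-iteration sketch in numpy (the MDP in the example is hypothetical):

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, iters=500):
    """Tabular value iteration: V(s) <- max_a [R(s,a) + gamma * E_s'[V(s')]].

    P : transition tensor, shape (S, A, S), rows summing to 1
    R : reward matrix, shape (S, A)
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * (P @ V)   # (S, A): one-step lookahead per action
        V = Q.max(axis=1)         # greedy backup
    return V
```

Deep RL replaces the table V with a neural network and the exact expectation with sampled transitions, but the backup structure is the same.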

Variable Stiffness Actuator

An actuator that can dynamically adjust its mechanical stiffness in real time, independent of its position. VSAs can be stiff for precision tasks and soft for safe interaction, mimicking how humans modulate muscle co-contraction. Research VSA designs include antagonistic springs, cam-lever mechanisms, and magnetorheological approaches.

HardwareActuationSafety

Velocity Kinematics

The mathematical relationship between joint velocities and end-effector velocities, expressed through the Jacobian matrix. Velocity kinematics is essential for real-time Cartesian velocity control, teleoperation, and singularity analysis. The Jacobian maps joint velocities to Cartesian velocities (forward) and Cartesian velocity commands to joint velocities (inverse).

Kinematics
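
For a planar 2-link arm (link lengths below are illustrative), both directions of the velocity map are a few lines of numpy:

```python
import numpy as np

def jacobian_2link(q, l1=0.3, l2=0.25):
    """Analytic Jacobian of a planar 2-link arm.

    Maps joint velocities [dq1, dq2] (rad/s) to end-effector velocity
    [dx, dy] (m/s). Singular when sin(q2) = 0 (arm fully stretched/folded).
    """
    q1, q2 = q
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

q = np.array([0.4, 0.8])           # a non-singular configuration
J = jacobian_2link(q)
xdot = J @ np.array([0.1, -0.2])   # forward: joint rates -> Cartesian velocity
qdot = np.linalg.solve(J, np.array([0.05, 0.0]))  # inverse: commanded velocity -> joint rates
```

For redundant arms (more joints than task dimensions), the inverse map uses the pseudoinverse or a damped least-squares solve instead.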

Visual Odometry

Estimating the robot's incremental motion by tracking visual features across consecutive camera frames. Visual odometry (VO) computes the relative pose change from feature matches — using PnP when 3D structure is available (stereo or RGB-D), or epipolar geometry for pure monocular input, which recovers motion only up to scale. It provides smooth but drift-accumulating motion estimates that are fused with other sensors in a SLAM or state estimation pipeline.

NavigationVisionSLAM

Visual SLAM

SLAM using cameras (monocular, stereo, or RGB-D) as the primary sensor. Visual SLAM systems (ORB-SLAM3, VINS-Mono, RTAB-Map) extract and track visual features to estimate camera pose and build a 3D map of landmarks. Visual SLAM works in GPS-denied environments and provides rich geometric and appearance information, but is sensitive to textureless scenes and lighting changes.

NavigationSLAMVision

Voice Coil Actuator

A direct-drive linear actuator based on the Lorentz force principle — a coil in a magnetic field produces linear force proportional to current. Voice coils offer extremely fast response (<1 ms), zero friction, and sub-micron precision. They are used in hard-drive heads, optical focusing, and high-bandwidth force control in dexterous manipulation research.

HardwareActuation

Voxel Grid

A 3D extension of occupancy grids where space is divided into cubic cells (voxels), each storing occupancy or other information. Voxel grids are used for 3D mapping, collision checking, and point cloud downsampling. OctoMap uses an octree-based voxel representation for memory-efficient 3D occupancy mapping. Resolution is typically 1–10 cm for indoor robots.

NavigationSLAM
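
Voxel-based point cloud downsampling can be sketched in a few lines of numpy (this keeps the first point per occupied voxel; production pipelines often keep the per-voxel centroid instead):

```python
import numpy as np

def voxel_downsample(points, voxel_size=0.05):
    """Keep one point per occupied voxel.

    points     : (N, 3) array of xyz coordinates in meters
    voxel_size : cube edge length in meters
    """
    keys = np.floor(points / voxel_size).astype(np.int64)   # integer voxel index per point
    _, idx = np.unique(keys, axis=0, return_index=True)     # first point in each voxel
    return points[np.sort(idx)]
```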

VAE

Variational Autoencoder — a generative model that learns a probabilistic latent space by encoding inputs into a distribution (mean + variance) and decoding samples from this distribution. The ELBO objective balances reconstruction quality and latent space regularity. VAEs are used in robot learning for: latent state representation, ACT policy architecture (CVAE encoder), and generative world models.

MLGenerative
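
The regularization term of the ELBO has a closed form for a diagonal Gaussian encoder against a standard normal prior; a minimal numpy sketch:

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) — the ELBO regularizer
    that pulls the encoder's latent distribution toward the prior.

    mu, logvar : per-dimension mean and log-variance of the latent code.
    """
    mu = np.asarray(mu, dtype=float)
    logvar = np.asarray(logvar, dtype=float)
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
```

Training minimizes reconstruction error plus this KL term (often with a weighting coefficient, as in beta-VAE and in ACT's CVAE objective).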

Vision Transformer

An image classification architecture that splits images into fixed-size patches, linearly embeds each patch, adds positional encoding, and processes the sequence with a standard transformer encoder. ViT has become the dominant visual backbone, surpassing CNNs when pre-trained on large datasets. ViT variants (DeiT, Swin, DINOv2) are the default visual encoders in modern VLA models.

MLVisionTransformer

ViperX 300

A 6-DOF robot arm by Trossen Robotics based on Dynamixel servo motors. The ViperX 300 is used as a teleoperation follower arm in the ALOHA system and as a standalone research manipulator. It offers moderate payload (750g), ROS support, and open kinematics. Its affordable price (~$5000) makes it popular in academic manipulation research.

HardwareManipulation

Visual Place Recognition

Identifying the location of a robot by recognizing a previously seen place from current camera observations. VPR systems retrieve the most similar stored image from a map database. Methods range from hand-crafted (DBoW2, VLAD) to learned (NetVLAD, AnyLoc). VPR is the primary loop-closure trigger in visual SLAM and enables global relocalization.

VisionSLAM

Voxel Hashing

A memory-efficient 3D representation that stores only occupied voxels in a hash table (keyed by integer 3D coordinates), rather than allocating the full 3D grid. Voxel hashing enables real-time, incremental 3D reconstruction at high resolution (1-5mm) from RGB-D data. InfiniTAM and Open3D implement voxel hashing for robot mapping.

Vision3DSLAM

Virtual Model Control

A whole-body control approach that specifies desired virtual forces and torques acting on the robot's body (as if virtual springs and dampers were attached) and maps these to joint torques via the robot's Jacobian. VMC provides intuitive force specifications for legged locomotion and has been applied to MIT Cheetah for dynamic running and jumping.

LocomotionControl

Vineyard Robot

An agricultural robot specialized for grape growing operations: pruning, harvesting, berry sampling, and spraying. Vineyard robots must navigate narrow row spacing, variable terrain slopes, and dense vine canopies. GPS-based navigation and computer vision for vine detection guide autonomous operation. They address the severe labor shortage in wine regions.

ApplicationsAgricultural

Video Prediction Model

A generative model that predicts future video frames from current frames and optionally action sequences. Video prediction models serve as world models in robotics: the robot can imagine the consequences of its actions before executing them. SV2P, FitVid, and UniSim are video prediction models applied to robot manipulation planning.

Robot LearningSimulation

Visual Imitation

Imitation learning from video demonstrations captured from third-person or first-person viewpoints, without proprioceptive information. Visual imitation leverages the abundance of instructional videos online (cooking, assembly, sports) to learn robot policies. Key challenges include correspondence between human and robot morphology and action space inference from observation only.

Robot LearningImitation LearningVision

W

Waypoint

A waypoint is an intermediate configuration (joint angles or Cartesian pose) that a robot's trajectory must pass through on the way from start to goal. Waypoints allow programmers and planners to guide the robot's path through specific poses — for example, to avoid an obstacle, approach an object from a safe direction, or sequence through a multi-step assembly procedure. In robot learning, high-level policies sometimes output waypoints that a lower-level motion planner interpolates into smooth joint trajectories, combining the generalization benefits of learned policies with the safety guarantees of classical planning.

PlanningTrajectory
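
Interpolating a dense trajectory through a waypoint list can be sketched as follows (linear interpolation for clarity; real motion planners use splines or time-parameterized velocity profiles):

```python
import numpy as np

def interpolate_waypoints(waypoints, steps_per_segment=10):
    """Linearly interpolate a dense trajectory through a list of waypoints.

    waypoints : (K, D) array of joint configurations or Cartesian points
    Returns an (N, D) trajectory that visits every waypoint in order.
    """
    waypoints = np.asarray(waypoints, dtype=float)
    out = [waypoints[:1]]
    for a, b in zip(waypoints[:-1], waypoints[1:]):
        # Interpolation parameters, excluding t=0 to avoid duplicate rows.
        ts = np.linspace(0.0, 1.0, steps_per_segment + 1)[1:, None]
        out.append(a + ts * (b - a))
    return np.vstack(out)
```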

Whole-body Control (WBC)

Whole-body control coordinates all joints of a legged or humanoid robot simultaneously to satisfy multiple competing objectives — maintaining balance, tracking end-effector targets, avoiding joint limits, and managing contact forces — solved as a real-time constrained optimization problem (typically a QP). WBC is essential for humanoids and legged manipulators because the base is not fixed: arm motion shifts the center of mass and must be compensated by leg and torso adjustments. Toolboxes such as Drake, Pinocchio, and OCS2 are commonly used to implement whole-body controllers in humanoid research. The Mobile ALOHA platform and Boston Dynamics Atlas rely on whole-body controllers for loco-manipulation. See WBC article.

ControlHumanoidLocomotion

Workspace

A robot's workspace is the set of all positions (and orientations) that the end-effector can reach given the robot's kinematic structure and joint limits. The reachable workspace is all positions the end-effector can reach in at least one orientation; the dexterous workspace is the smaller subset reachable in every orientation — the most useful region for manipulation tasks requiring arbitrary approach angles. Workspace analysis informs cell layout (how far apart robots and parts should be), robot selection (matching reach to task layout), and motion planning (identifying singularity-free paths across the workspace).

KinematicsHardwarePlanning

Whole-Body Control Framework

A hierarchical control architecture for robots with many DOF (humanoids, mobile manipulators) that formulates multiple tasks as a prioritized stack of quadratic programs. Higher-priority tasks (balance, collision avoidance) are strictly enforced; lower-priority tasks (end-effector tracking, posture) are achieved in the null space of higher ones. Hierarchical quadratic programming (HQP) and operational-space control (OSC) are common formulations.

ControlHumanoid

Workspace Analysis

The systematic study of a robot's reachable and dexterous workspace, including workspace volume, shape, singularity distribution, and manipulability throughout the workspace. Workspace analysis guides robot selection for applications, optimal base placement, and task feasibility assessment. It is performed through forward kinematics sampling or analytical methods.

KinematicsHardware
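
Forward-kinematics sampling can be sketched for a planar 2-link arm (link lengths below are illustrative); its reachable radii must fall in [|l1 − l2|, l1 + l2], which makes the estimate easy to sanity-check:

```python
import numpy as np

def sample_workspace(fk, joint_limits, n=10000, seed=0):
    """Estimate a reachable workspace by sampling joint configurations
    uniformly within limits and applying forward kinematics.

    fk           : function mapping a joint vector to an end-effector point
    joint_limits : (D, 2) array of [low, high] per joint, in radians
    """
    rng = np.random.default_rng(seed)
    limits = np.asarray(joint_limits, dtype=float)
    qs = rng.uniform(limits[:, 0], limits[:, 1], size=(n, len(limits)))
    return np.array([fk(q) for q in qs])

def fk_2link(q, l1=0.3, l2=0.2):
    """Forward kinematics of a planar 2-link arm (illustrative lengths)."""
    return np.array([l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
                     l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])])

pts = sample_workspace(fk_2link, [[-np.pi, np.pi], [-np.pi, np.pi]], n=2000)
radii = np.linalg.norm(pts, axis=1)   # all within [0.1, 0.5] for these lengths
```

Dexterous-workspace estimation extends this by additionally checking, at each sampled position, whether every required orientation is reachable.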

World Model

A learned model of environment dynamics that predicts future observations or states given current state and action. World models enable model-based RL (plan by imagining future trajectories), data augmentation (generate synthetic training data), and safety verification (predict consequences of actions before executing). Video prediction models and latent dynamics models are the two main architectures.

Robot LearningRLSimulation

Weight Initialization

The strategy for setting initial values of neural network parameters before training. Proper initialization (Xavier/Glorot, Kaiming/He) ensures that activations and gradients maintain appropriate magnitude across layers at the start of training. Poor initialization can cause vanishing or exploding gradients. Pre-trained initialization (from foundation models) is a form of transfer learning.

MLTraining
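
Kaiming/He initialization for a ReLU layer draws weights with standard deviation sqrt(2 / fan_in); a minimal numpy sketch:

```python
import numpy as np

def kaiming_normal(fan_in, fan_out, rng=None):
    """He/Kaiming normal initialization for a ReLU layer.

    std = sqrt(2 / fan_in) keeps activation variance roughly constant
    across layers, since ReLU zeroes about half of its inputs.
    """
    rng = rng or np.random.default_rng(0)
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

W = kaiming_normal(1024, 512)   # weight matrix for a 1024 -> 512 layer
```

Xavier/Glorot initialization instead uses std = sqrt(2 / (fan_in + fan_out)), appropriate for tanh-like activations.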

Warehouse Robot

An autonomous mobile robot operating in warehouse and distribution center environments for order fulfillment, inventory management, and goods movement. Categories include goods-to-person robots (Kiva/Amazon Robotics), autonomous mobile robots (Locus, 6RS), and autonomous forklifts (Seegrid). Warehouse robots have driven the fastest commercial adoption of mobile robotics.

ApplicationsMobile RoboticsIndustrial

Welding Robot

An industrial robot arm equipped with a welding torch for automated arc welding (MIG, TIG, spot). Welding robots provide consistent weld quality, higher productivity, and remove humans from hazardous fume and arc flash exposure. They are the second-largest application of industrial robots (after material handling). Seam tracking sensors adapt to part variation.

ApplicationsIndustrial

Wavelet Transform

A multi-resolution signal analysis technique that decomposes images or time series into components at different frequency and spatial scales. Wavelets are used in image compression, texture analysis, and vibration signal processing for robot fault detection. Discrete Wavelet Transform (DWT) provides efficient multi-scale decomposition.

VisionMath

Walking

Legged locomotion with at least one leg always in contact with the ground (no flight phase). Walking is statically stable at low speed and dynamically controlled at higher speed. Human-like walking uses an inverted pendulum model for CoM motion. Bipedal walking is the primary locomotion mode for humanoid robots intended to operate in human environments.

Locomotion

Wrist Configuration

The joint configuration of the robot's wrist joints (typically 3 revolute joints providing full wrist orientation DOF). Wrist configuration selection is part of inverse kinematics and affects manipulability, collision avoidance, and singularity proximity. For redundant wrists, multiple configurations reach the same end-effector orientation, enabling optimization.

ManipulationKinematics

Worm Gear

A gear mechanism where a helical worm (screw) meshes with a worm wheel, achieving high reduction ratios with perpendicular shaft axes. At high reduction ratios (low lead angles), worm gears are self-locking — the worm can drive the wheel but not vice versa — which can eliminate the need for brakes in gravity-loaded axes. They are used in robot arm base joints and pan-tilt mechanisms.

HardwareActuation

WebSocket

A bidirectional, full-duplex communication protocol over a single TCP connection, used for real-time web-to-robot communication. WebSockets enable browser-based robot control dashboards, live video streaming, and teleoperation interfaces without polling. ROSBridge wraps ROS topics in a WebSocket JSON protocol for web client access.

Software

Wireless Robot Control

Controlling a robot via Wi-Fi, Bluetooth, Zigbee, or proprietary radio links instead of tethered cables. Wireless control enables greater mobility and simplifies deployment but introduces latency and reliability challenges. Safety-critical functions (e-stop, safety monitoring) require redundant wired channels or fail-safe wireless protocols with guaranteed delivery.

HardwareSoftwareSafety

Weed Control Robot

An agricultural robot that autonomously detects and removes weeds using mechanical (cutting, pulling) or targeted chemical (micro-dose spraying) methods. Weed robots reduce herbicide use by 90%+ compared to blanket spraying. Computer vision models trained on crop/weed image datasets enable plant-level discrimination at field scale.

ApplicationsAgriculturalVision

Wound Care Robot

A medical robot that assists with wound inspection, cleaning, and dressing. Wound care robots use imaging to assess wound severity (size, depth, tissue type, infection signs) and may autonomously irrigate or apply dressing materials. They are particularly valuable in remote care settings and for reducing infection risk from frequent dressing changes.

ApplicationsMedical

Z

Zarr (data format)

Zarr is an open-source format for storing n-dimensional arrays in chunked, compressed form, designed for cloud-native and parallel I/O workloads. In robotics, Zarr is used to store large robot demonstration datasets (images, joint states, actions) in a format that can be read efficiently from object storage (S3, GCS) without downloading entire files. Unlike HDF5, Zarr supports concurrent writes, making it suitable for distributed data collection pipelines. Zarr v3 standardized the format and added support for sharding (combining many small chunks into fewer large files), which improves cloud storage efficiency. Projects like LeRobot and several autonomous vehicle datasets have adopted Zarr for large-scale dataset hosting.

DataStorageEngineering

Zero-shot Generalization

Zero-shot generalization is the ability of a trained policy to successfully perform on tasks, objects, or environments it has never explicitly seen during training, without any additional fine-tuning or demonstrations. True zero-shot transfer is a major goal of robot foundation models — a policy that generalizes zero-shot to novel household objects or new language instructions would dramatically reduce the data collection burden. Current VLA models show promising zero-shot language generalization (understanding novel phrasings of known task types) but still struggle with truly novel object categories or completely new manipulation skills. Improving zero-shot performance is the central motivation for scaling robot datasets and model sizes. See also Zero-shot Transfer article.

GeneralizationFoundation ModelResearch Frontier

Zero Dynamics

The internal dynamics of a nonlinear system when the output is constrained to be identically zero. For a robot with position output, the zero dynamics correspond to internal motions not visible in the output. Stable zero dynamics (minimum-phase systems) are necessary for output feedback control stability. Underactuated robots often have nontrivial zero dynamics.

ControlKinematics

Zero Moment Point

The point on the ground about which the total ground reaction moment (from all foot contacts) has zero horizontal component. For stable walking, the ZMP must lie within the support polygon (the convex hull of contact points). ZMP-based controllers generate CoM trajectories that keep the ZMP within bounds. It is the classical stability criterion for bipedal locomotion.

LocomotionControl
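
Under the linear inverted pendulum model (constant CoM height), the ZMP has a closed form; a minimal numpy sketch for one horizontal axis:

```python
import numpy as np

def zmp_from_com(x_com, xdd_com, z_com, g=9.81):
    """ZMP under the linear inverted pendulum model:

        x_zmp = x_com - (z_com / g) * xdd_com

    x_com   : horizontal CoM position (m)
    xdd_com : horizontal CoM acceleration (m/s^2)
    z_com   : constant CoM height above the ground (m)

    A controller checks that x_zmp stays inside the support polygon.
    """
    return x_com - (z_com / g) * xdd_com
```

With zero acceleration the ZMP sits directly under the CoM; accelerating forward shifts the ZMP backward, which is why aggressive motions push the ZMP toward the support polygon's edge.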

Y

YOLO

You Only Look Once — a family of single-stage object detectors that predict bounding boxes and class probabilities directly from full images in one forward pass. YOLO models (v5, v7, v8, v9) prioritize inference speed, making them suitable for real-time robotic perception. With TensorRT optimization, recent YOLO models can run 640×640 detection at real-time rates (100+ FPS on higher-end NVIDIA Jetson modules), enabling fast grasp candidate detection.

MLVision

6

6-DOF Pose Estimation

Estimating the full 6-DOF pose (3D position + 3D orientation) of an object from images or point clouds. Methods include: direct pose regression (PoseCNN), keypoint regression (DOPE, PVNet), render-and-compare (DeepIM, MegaPose), and foundation model approaches (FoundPose). Accurate 6-DOF pose estimation enables precise pick-and-place and assembly task planning.

VisionManipulation