Robot Manipulation
Manipulation is the ability to interact with and modify the physical world — grasping objects, opening doors, assembling parts, and using tools. It is arguably the most challenging domain in robotics due to the complexity of contact physics, the high dimensionality of the action space, and the need for precise control.
1. Pick-and-Place
Task Definition
Grasp an object at one location and place it at another. This is the simplest form of manipulation, yet it remains an open research problem for arbitrary objects.
Formal Specification
Input: Object pose (or detection from camera), target pose
Output: Grasp pose → approach trajectory → grasp → lift → place
Metric: Success Rate (% of successful placements), completion time
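The two metrics above can be computed from logged trials in a few lines. The following is a minimal sketch, assuming each trial records the final object position, the target position, and the elapsed time; the `Placement` class and the 2 cm default tolerance are illustrative choices, not part of any benchmark's official scoring.

```python
import math
from dataclasses import dataclass


@dataclass
class Placement:
    """One logged trial: positions in metres, duration in seconds."""
    final_xyz: tuple
    target_xyz: tuple
    duration_s: float


def is_success(p: Placement, tol_m: float = 0.02) -> bool:
    """A trial succeeds if the object ends within tol_m of the target."""
    return math.dist(p.final_xyz, p.target_xyz) <= tol_m


def summarize(trials, tol_m: float = 0.02):
    """Return (success rate in percent, mean time of successful trials)."""
    successes = [t for t in trials if is_success(t, tol_m)]
    rate = 100.0 * len(successes) / len(trials) if trials else 0.0
    mean_time = (sum(t.duration_s for t in successes) / len(successes)
                 if successes else float("nan"))
    return rate, mean_time


trials = [
    Placement((0.50, 0.20, 0.10), (0.50, 0.20, 0.10), 8.0),   # exact placement
    Placement((0.51, 0.20, 0.10), (0.50, 0.20, 0.10), 10.0),  # 1 cm off
    Placement((0.55, 0.20, 0.10), (0.50, 0.20, 0.10), 9.0),   # 5 cm off
    Placement((0.50, 0.20, 0.30), (0.50, 0.20, 0.10), 7.0),   # 20 cm off
]
rate, mean_time = summarize(trials)
print(f"success rate: {rate:.1f}%, mean time: {mean_time:.1f} s")
# prints "success rate: 50.0%, mean time: 9.0 s"
```

Averaging completion time over successful trials only is one common convention; a benchmark may instead count failures at a time limit.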
Why It's Hard
Challenges in pick-and-place:
├── Object diversity — Varying shapes, sizes, weights, textures
├── Grasp planning — Where to grasp? What orientation?
├── Slip detection — Is the object securely held?
├── Occlusion — Camera may not see the object clearly
├── Collision avoidance — Avoid hitting other objects
└── Precision — Must place within tolerance (cm-level)
Key Benchmarks
| Benchmark | Year | Tasks | Robot | Key Feature |
|---|---|---|---|---|
| RLBench | 2020 | 100 | Franka Panda | Diverse manipulation tasks |
| Meta-World | 2019 | 50 | Sawyer arm | Multi-task RL benchmark |
| LIBERO | 2023 | 130 | Franka Panda | 4 task suites for lifelong learning |
| robosuite | 2020 | 8+ | Multiple arms | Modular framework |
| CALVIN | 2022 | 34 | Franka Panda | Long-horizon, language-conditioned |
Typical Pipeline
RGB-D Camera → Object Detection (YOLO/SAM) → 6-DoF Pose Estimation
→ Grasp Planning (GraspNet / analytic) → Motion Planning (RRT/PRM)
→ Execution (impedance control) → Verification (did it succeed?)
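Between grasp planning and motion planning, the chosen grasp is usually expanded into a short list of Cartesian waypoints that the motion planner then connects. The sketch below shows one possible expansion for a top-down grasp. The function name `pick_place_waypoints`, the standoff and lift distances, and the fixed approach direction are assumptions made for illustration; a real system would use full 6-DoF poses and the output of its own grasp planner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Waypoint:
    name: str
    xyz: tuple            # position in metres, robot base frame
    gripper_closed: bool  # gripper state to hold while moving here


def _offset(xyz, direction, dist):
    """Shift a point by dist along a direction vector."""
    return tuple(p + dist * d for p, d in zip(xyz, direction))


def pick_place_waypoints(grasp_xyz, place_xyz, approach_dir=(0.0, 0.0, -1.0),
                         standoff=0.10, lift=0.15):
    """Expand a grasp and a place position into an ordered waypoint list.

    approach_dir is the unit vector along which the gripper approaches
    (straight down by default), so retreating means moving against it.
    """
    back = tuple(-d for d in approach_dir)
    return [
        Waypoint("pre_grasp", _offset(grasp_xyz, back, standoff), False),
        Waypoint("grasp", tuple(grasp_xyz), False),
        Waypoint("close", tuple(grasp_xyz), True),
        Waypoint("lift", _offset(grasp_xyz, back, lift), True),
        Waypoint("pre_place", _offset(place_xyz, back, lift), True),
        Waypoint("place", tuple(place_xyz), True),
        Waypoint("release", tuple(place_xyz), False),
        Waypoint("retreat", _offset(place_xyz, back, standoff), False),
    ]


for wp in pick_place_waypoints((0.5, 0.0, 0.02), (0.3, 0.3, 0.02)):
    print(f"{wp.name:10s} {wp.xyz} closed={wp.gripper_closed}")
```

Keeping the waypoints separate from the planner makes the verification step easier: each waypoint name gives a natural checkpoint at which to test whether the object is still held.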
Where It Applies
- Warehouse logistics: Pack items into boxes
- Manufacturing: Place components on assembly lines
- Household: Clear the table, load the dishwasher
2. Assembly
Task Definition
Fit multiple parts together to form a larger structure. Requires precise positioning, insertion, and often force-controlled interactions.
Types of Assembly
| Type | Example | Characteristics |
|---|---|---|
| Peg-in-hole | Insert a peg into a hole | Classic benchmark |
| Gear assembly | Mesh gears together | Tight tolerances |
| Furniture assembly | Build IKEA furniture | Long-horizon, language |
| Electronics assembly | Place PCB components | Micro-precision |
Key Benchmarks
- Peg-in-hole: Classical robotics benchmark, with clearances often below 0.1 mm
- Furniture Assembly: Nocker et al. (2023), requires language understanding + precise manipulation
- RoboSet Assembly: Real-world assembly tasks with demonstrations
Force Control in Assembly
# Impedance control for assembly tasks
class ImpedanceController:
    """
    Impedance control: makes the robot behave like a spring-damper system.
    Useful for assembly, where contact forces must be regulated.
    """

    def __init__(self, stiffness, damping):
        self.K = stiffness  # Spring constant (N/m)
        self.D = damping    # Damping coefficient (N·s/m)

    def compute_force(self, x_current, x_desired, v_current, v_desired):
        """
        F = K * (x_desired - x_current) + D * (v_desired - v_current)

        When the robot pushes against a wall:
        - x_current cannot advance, so the position error (and with it the
          commanded force) grows as x_desired moves past the surface
        - K sets the force per unit of position error; D damps the motion
        """
        position_error = x_desired - x_current
        velocity_error = v_desired - v_current
        force = self.K * position_error + self.D * velocity_error
        return force
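To see how the spring-damper law regulates contact force, it can help to simulate a one-dimensional end-effector pressing on a rigid wall. The sketch below is a toy model under stated assumptions: a 1 kg point mass, a perfectly rigid wall, semi-implicit Euler integration, and hypothetical gains. It restates the controller compactly so the block runs on its own. The desired position is set 2 cm past the wall, so the steady-state contact force should approach K * (x_desired - x_wall).

```python
class ImpedanceController:
    """Same spring-damper law as above, restated so this block is standalone."""

    def __init__(self, stiffness, damping):
        self.K = stiffness  # N/m
        self.D = damping    # N·s/m

    def compute_force(self, x_current, x_desired, v_current, v_desired):
        return (self.K * (x_desired - x_current)
                + self.D * (v_desired - v_current))


def simulate_contact(K=500.0, D=50.0, mass=1.0, x_wall=0.03, x_desired=0.05,
                     dt=0.001, steps=3000):
    """Push a point mass toward x_desired, which lies beyond a rigid wall.

    Returns the final position and the last commanded force.
    All numeric values are illustrative assumptions.
    """
    ctrl = ImpedanceController(K, D)
    x, v = 0.0, 0.0
    force = 0.0
    for _ in range(steps):
        force = ctrl.compute_force(x, x_desired, v, 0.0)
        v += (force / mass) * dt   # semi-implicit Euler: velocity first
        x += v * dt
        if x >= x_wall:            # rigid wall: clamp position, kill velocity
            x, v = x_wall, 0.0
    return x, force


x_soft, f_soft = simulate_contact(K=500.0)
x_stiff, f_stiff = simulate_contact(K=1000.0)
print(f"K=500:  x={x_soft:.3f} m, contact force={f_soft:.2f} N")
print(f"K=1000: x={x_stiff:.3f} m, contact force={f_stiff:.2f} N")
```

In this toy model, doubling the stiffness doubles the steady-state contact force for the same penetration command. This is why assembly controllers often lower K along the insertion axis: position errors then translate into gentle rather than damaging forces.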
Where It Applies
- Manufacturing: Automated assembly lines
- Construction: Prefabricated building assembly
- Space: In-orbit satellite assembly
3. Dexterous Manipulation
Task Definition
Manipulate objects using multi-fingered hands with dexterity comparable to human hands. This includes in-hand reorientation, precision grasping, and fine motor skills.
Why Dexterous?
Comparison of grippers:
Parallel Gripper (2 fingers):
├── Simple, reliable
├── Limited grasp types (pinch only)
└── Cannot reorient objects in hand
Dexterous Hand (typically 3 to 5 fingers):
├── Can grasp a wide range of shapes
├── Can reorient objects in-hand
├── Can use tools
└── Much harder to control (16 to 20+ DoF)
Key Benchmarks
| Benchmark | Year | Hand | Tasks | Key Feature |
|---|---|---|---|---|
| DexYCB | 2021 | Human hand | Grasping | 582K hand-object frames |
| Adroit | 2018 | Shadow Hand | Door, hammer, pen, relocate | RL benchmark |
| DexArt | 2023 | Allegro Hand | Articulated objects | Simulated, point-cloud RL |
| TACTILE | 2024 | Various | Contact-rich | Tactile sensing |
Datasets
- DexYCB (CVPR 2021): 582K RGB-D frames of human hands grasping 20 YCB objects, 10 subjects
- GRAB (Taheri et al., 2020): Whole-body grasping with contact
- OakInk (Yang et al., 2022): 1,800+ objects, hand-object interaction
State-of-the-Art Methods
Dexterous manipulation approaches:
1. RL + Simulation (most popular)
   Train in Isaac Gym / MuJoCo with domain randomization
   Example: DAPG (Rajeswaran et al., 2018)
2. Teleoperation → Imitation
   Human demonstrates with glove → robot learns from demos
   Example: DexYCB, T-AIR
3. Tactile-Guided Control
   Use tactile sensors (GelSight, DIGIT) for contact feedback
   Example: TACTO, FingerVision
4. Foundation Models for Manipulation
   Use pretrained vision models to guide manipulation
   Example: RT-2, Octo
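As a rough illustration of approach 1, the sketch below samples a fresh set of physics parameters for each training episode, which is the core idea of domain randomization (Tobin et al., 2017). The parameter names and ranges are hypothetical, and the simulator reset is left as a comment rather than tied to any Isaac Gym or MuJoCo API.

```python
import random
from dataclasses import dataclass


@dataclass
class PhysicsParams:
    object_mass_kg: float
    friction_coeff: float
    actuator_gain_scale: float
    obs_noise_std: float


# Hypothetical randomization ranges (low, high); real ranges are tuned
# per robot and task, typically centred on measured nominal values.
RANGES = {
    "object_mass_kg": (0.05, 0.50),
    "friction_coeff": (0.5, 1.5),
    "actuator_gain_scale": (0.8, 1.2),
    "obs_noise_std": (0.0, 0.01),
}


def sample_params(rng: random.Random) -> PhysicsParams:
    """Draw each parameter independently and uniformly from its range."""
    return PhysicsParams(
        **{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}
    )


rng = random.Random(0)  # seeded so that runs are reproducible
for episode in range(3):
    params = sample_params(rng)
    # A training loop would reset the simulator with `params` here,
    # then roll out the policy and update it as usual.
    print(episode, params)
```

Because the policy never sees the same physics twice, it is pushed toward behaviour that works across the whole range, with the hope that the real robot falls somewhere inside it.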
Where It Applies
- Prosthetics: Dexterous artificial hands
- Manufacturing: Handling small, irregular parts
- Service: Manipulating everyday objects (bottles, tools, food)
4. Deformable Object Manipulation
Task Definition
Manipulate objects that change shape during interaction — cloth, rope, food, cables, soft tissues. Unlike rigid objects, deformable objects have infinite-dimensional state spaces.
Types
| Object | Example Task | Difficulty |
|---|---|---|
| Cloth | Fold a shirt | High — draping, folding |
| Rope | Tie a knot | High — topological changes |
| Cable | Route a cable | Medium — routing, avoiding tangles |
| Food | Slice vegetables | Medium — cutting, portioning |
| Soft tissue | Surgical manipulation | Very high — precision, safety |
Key Challenges
Deformable manipulation challenges:
├── State representation — How to represent cloth/rope state?
├── Simulation — Physics of deformable bodies is expensive
├── Perception — Tracking deformable objects is hard
├── Planning — Infinite-dimensional configuration space
└── Sim-to-real — Deformable sim doesn't transfer well
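One common answer to the state-representation question for rope and cable is to reduce the tracked centerline to a fixed number of keypoints, giving a finite vector a planner or policy can consume. The sketch below resamples a polyline at equal arc-length spacing. It is a minimal example assuming the centerline has already been extracted by perception; `resample_polyline` is an illustrative helper, not a function from any benchmark.

```python
import math


def resample_polyline(points, num_keypoints):
    """Resample a polyline to num_keypoints points equally spaced in arc length."""
    if num_keypoints < 2 or len(points) < 2:
        raise ValueError("need at least two input points and two keypoints")

    # Cumulative arc length at each input vertex.
    cum = [0.0]
    for a, b in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]

    out = []
    seg = 0  # index of the segment that contains the current target length
    for i in range(num_keypoints):
        target = total * i / (num_keypoints - 1)
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        a, b = points[seg], points[seg + 1]
        out.append(tuple(pa + t * (pb - pa) for pa, pb in zip(a, b)))
    return out


# An L-shaped rope of length 2, observed as three vertices.
rope = [(0, 0), (1, 0), (1, 1)]
keypoints = resample_polyline(rope, 5)
state = [c for p in keypoints for c in p]  # flat state vector
print(keypoints)
# prints [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.0, 0.5), (1.0, 1.0)]
print(len(state))  # prints 10
```

The number of keypoints trades fidelity for dimensionality: too few and knots or tight bends are lost, too many and planning in the resulting space becomes expensive.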
Key Benchmarks
- SoftGym (Lin et al., 2020): Cloth/fluid manipulation in simulation
- ROBOSURF (Qi et al., 2022): Cable routing benchmark
- Gym-Fluid (various): Fluid pouring and manipulation
Where It Applies
- Laundry: Folding clothes
- Agriculture: Harvesting soft fruits
- Surgery: Robotic-assisted minimally invasive surgery
5. Tool Use
Task Definition
Use external tools to extend the robot's capabilities — hammers, screwdrivers, spatulas, scissors. Requires understanding tool affordances and how to grasp and wield them.
Examples
Tool use tasks:
├── Hammer — Drive a nail
├── Screwdriver — Turn a screw
├── Spatula — Flip a pancake
├── Scissors — Cut paper
├── Wrench — Tighten a bolt
└── Brush — Paint a surface
Key Datasets
- Something-Something V2 (Goyal et al., 2017): 220K videos of object interactions including tool use
- EPIC-KITCHENS (Damen et al., 2018): Egocentric kitchen activities with tool manipulation
- RoboSet (Bharadhwaj et al., 2023): 11 tasks including tool use (cut, wipe, sweep)
Where It Applies
- Manufacturing: Using power tools autonomously
- Kitchen: Cooking with utensils
- Maintenance: Using diagnostic equipment
6. Mobile Manipulation
Task Definition
Combine navigation and manipulation: the robot must move to a location and then manipulate objects there. This is arguably the most realistic and most demanding task category, because a failure in either capability ends the task.
Why It Matters
Real-world tasks are almost always mobile manipulation:
"Make coffee" = Navigate to kitchen + pick up mug +
                operate coffee machine + carry mug back
"Clean the room" = Navigate to mess + pick up items +
                   navigate to trash/shelf + place items
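The decompositions above can be written down as an ordered plan of navigation and manipulation steps. The sketch below does this for "Make coffee" and also shows why long plans are fragile. The skill names, the stub handlers, and the 0.9 per-step success rate are hypothetical, and the probability calculation assumes step outcomes are independent.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Step:
    kind: str    # "navigate" or "manipulate"
    skill: str   # hypothetical skill name
    target: str


MAKE_COFFEE: List[Step] = [
    Step("navigate", "go_to", "kitchen"),
    Step("manipulate", "pick", "mug"),
    Step("manipulate", "operate", "coffee_machine"),
    Step("navigate", "carry_to", "user"),
]


def run_plan(plan: List[Step],
             handlers: Dict[str, Callable[[Step], bool]]) -> int:
    """Run steps in order; return how many completed before the first failure."""
    done = 0
    for step in plan:
        if not handlers[step.kind](step):
            break
        done += 1
    return done


def plan_success_probability(per_step_success: List[float]) -> float:
    """Chance the whole plan succeeds if step outcomes are independent."""
    return math.prod(per_step_success)


# Stub handlers standing in for real navigation and manipulation stacks.
handlers = {
    "navigate": lambda step: True,
    "manipulate": lambda step: step.target != "coffee_machine",  # simulated failure
}
print(run_plan(MAKE_COFFEE, handlers))  # prints 2
print(round(plan_success_probability([0.9] * len(MAKE_COFFEE)), 4))  # prints 0.6561
```

Even with each step at 90% reliability, a four-step plan succeeds only about two times in three under the independence assumption. This is one reason mobile manipulation benchmarks emphasise long-horizon success rather than per-skill accuracy.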
Key Benchmarks
| Benchmark | Year | Setting | Tasks | Key Feature |
|---|---|---|---|---|
| Habitat 2.0 | 2021 | Simulation | Rearrangement | Physics-based |
| TEACh | 2022 | AI2-THOR | Dialog-based tasks | Language + manipulation |
| BEHAVIOR-1K | 2023 | OmniGibson | 1000 activities | Full household tasks |
| RoboCasa | 2024 | Simulation | Kitchen tasks | Large-scale kitchen sim |
Where It Applies
- Home assistants: General-purpose household robots
- Elder care: Help with daily activities
- Logistics: Pick items from shelves and deliver them
Manipulation Task Comparison
| Task | Key Challenge | DoF Required | Simulation | Real-World Gap |
|---|---|---|---|---|
| Pick-and-Place | Grasp planning | 6–7 | Good | Small |
| Assembly | Precision, force control | 6–7 + force | Moderate | Medium |
| Dexterous | High-DoF control | 20+ | Moderate | Large |
| Deformable | State representation | 6+ | Poor | Very large |
| Tool Use | Affordance understanding | 6–7 | Good | Medium |
| Mobile Manipulation | Navigation + manipulation | 6 + base | Moderate | Large |
Object Datasets for Manipulation
| Dataset | Objects | Modality | Key Feature |
|---|---|---|---|
| YCB | 77 (5 categories) | 3D models + physical | Industry standard |
| DexYCB | 20 YCB objects | RGB-D + hand tracking | Dexterous grasping |
| Google Scanned Objects | 1,030 | 3D scans | Simulation assets |
| ObjectNet | 313 classes, 50K images | RGB | Robustness testing |
| ACID | 1,000+ | 3D models | Articulated objects |
| OmniObject3D | 6,000+ | 3D scans + textures | Large-vocabulary real-scanned objects (190 categories) |
References
- Tobin et al. (2017). "Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World." IROS 2017
- Rajeswaran et al. (2018). "Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations." RSS 2018
- James et al. (2020). "RLBench: The Robot Learning Benchmark & Learning Environment." IEEE RA-L
- Mees et al. (2022). "CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks." IEEE RA-L
- Bharadhwaj et al. (2023). "RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking." ICRA 2024 (introduces the RoboSet dataset)
- Nasiriany et al. (2024). "RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots." RSS 2024
- Open X-Embodiment Collaboration (2024). "Open X-Embodiment: Robotic Learning Datasets and RT-X Models." ICRA 2024