Hands-on Projects

This course culminates in 10 hands-on projects that cover the major robot task categories — from navigation and perception to manipulation, language understanding, and multi-agent coordination. Each project is designed to give you end-to-end experience with a real robotic task, progressing through three algorithm tiers:

| Tier | Era | Philosophy | Examples |
|------|-----|------------|----------|
| Traditional | Pre-deep-learning | Model-based, hand-crafted features | PID, Bug, PDDL, IK |
| Classical Learning | Deep learning era | Data-driven, end-to-end neural networks | DRL, DeepSORT, GraspNet, Seq2Seq |
| Modern / Foundation | Foundation model era | Pre-trained, few-shot, generalist | LLM, Transformer, Foundation Model |

By implementing all three tiers for each task, you will develop an intuition for when traditional methods suffice, when learning-based methods shine, and where foundation models are heading.


Project Overview

| # | Project | Task Category | Algorithm Spectrum | Difficulty |
|---|---------|---------------|--------------------|------------|
| 1 | Line Following Robot | Navigation | PID → Stanley → RL | ⭐⭐ |
| 2 | Autonomous Obstacle Avoidance | Navigation | Bug/VFH → DWA → DRL | ⭐⭐⭐ |
| 3 | SLAM & Autonomous Navigation | SLAM | gmapping → Cartographer → ORB-SLAM3 | ⭐⭐⭐⭐ |
| 4 | Visual Object Tracking | Perception | KCF → SORT/DeepSORT → Transformer | ⭐⭐⭐ |
| 5 | Robotic Arm Grasping | Manipulation | IK → GraspNet → RL | ⭐⭐⭐ |
| 6 | Voice Interaction Robot | Language | ASR+keywords → NLU pipeline → LLM | ⭐⭐⭐ |
| 7 | Multi-Robot Formation | Multi-Agent | Leader-Follower → Consensus → MARL | ⭐⭐⭐⭐ |
| 8 | Vision-Language Navigation | VLN | Modular → Seq2Seq → Foundation Model | ⭐⭐⭐⭐ |
| 9 | Object Assembly with TAMP | TAMP | PDDL → TAMP → LLM-based | ⭐⭐⭐⭐⭐ |
| 10 | Mobile Manipulation | Mobile+Manipulation | Decoupled → Joint → Foundation Model | ⭐⭐⭐⭐⭐ |

The projects are arranged in a dependency graph. Start from the bottom (foundations) and work upward.

                    ┌─────────────────────────────────┐
                    │  10. Mobile Manipulation (⭐⭐⭐⭐⭐) │
                    └──────────┬──────────┬────────────┘
                               │          │
              ┌────────────────┘          └────────────────┐
              ▼                                            ▼
┌──────────────────────┐                    ┌───────────────────────────┐
│ 5. Robotic Arm       │                    │ 9. Assembly with TAMP     │
│    Grasping (⭐⭐⭐)    │                    │    (⭐⭐⭐⭐⭐)               │
└──────────┬───────────┘                    └─────────────┬─────────────┘
           │                                              │
           │         ┌───────────────────────────┐        │
           │         │ 8. Vision-Language        │        │
           │         │    Navigation (⭐⭐⭐⭐)    │        │
           │         └──────┬──────────┬────────┘        │
           │                │          │                 │
           │     ┌──────────┘          └──────────┐      │
           │     ▼                                ▼      │
           │ ┌──────────────────┐  ┌────────────────────┐│
           │ │ 4. Visual Object │  │ 6. Voice Interaction││
           │ │    Tracking (⭐⭐⭐)│  │    Robot (⭐⭐⭐)    ││
           │ └────────┬─────────┘  └─────────┬──────────┘│
           │          │                      │           │
           └──────────┼──────────────────────┼───────────┘
                      │                      │
           ┌──────────┼──────────────────────┘
           ▼          ▼
┌─────────────────────────────┐   ┌──────────────────────────┐
│ 7. Multi-Robot Formation    │   │ 3. SLAM & Autonomous     │
│    (⭐⭐⭐⭐)                  │   │    Navigation (⭐⭐⭐⭐)    │
└──────────┬──────────────────┘   └─────────────┬────────────┘
           │                                    │
           │         ┌──────────────────┐       │
           └────────►│ 2. Obstacle      │◄──────┘
                     │    Avoidance (⭐⭐⭐)│
                     └────────┬─────────┘
                     ┌────────▼─────────┐
                     │ 1. Line Following │
                     │    Robot (⭐⭐)    │
                     └──────────────────┘

Reading the diagram: arrows point from each project to its prerequisites. For example, Project 10 (Mobile Manipulation) requires skills from both Project 5 (Arm Grasping) and Project 9 (TAMP). You may work on projects at the same "level" in parallel.


Project Descriptions

1. Line Following Robot

Task: Build a robot that autonomously follows a line track on the ground using a camera or infrared sensor array.

Algorithm spectrum: Start with a classic PID controller to stay centered on the line. Advance to the Stanley controller for smoother path tracking with curvature awareness. Finally, train a reinforcement learning (RL) agent that learns the control policy directly from sensor observations, handling complex track geometries without explicit tuning.
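
To make the traditional tier concrete, here is a minimal PID sketch. It assumes a hypothetical `read_line_offset()` that returns the line's lateral offset normalized to [-1, 1] and a hypothetical `set_wheel_speeds(left, right)` interface for a differential-drive base; the gains are placeholders you would tune on your own robot.

```python
import time

# Minimal PID line-follower sketch. read_line_offset() and
# set_wheel_speeds() are hypothetical interfaces; adapt to your hardware.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def control_loop(read_line_offset, set_wheel_speeds, base_speed=0.3, dt=0.02):
    pid = PID(kp=0.8, ki=0.0, kd=0.15, dt=dt)  # placeholder gains; tune on hardware
    while True:
        error = read_line_offset()              # +1: line far right of center
        turn = pid.step(error)
        # Steer toward the line: positive error speeds the left wheel,
        # slows the right, turning the robot back onto the track.
        set_wheel_speeds(base_speed + turn, base_speed - turn)
        time.sleep(dt)
```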

Skills you will practice: Sensor reading, PWM motor control, feedback loops, basic RL training loops.


2. Autonomous Obstacle Avoidance

Task: Navigate a mobile robot through an environment cluttered with obstacles, reaching a goal without collisions.

Algorithm spectrum: Begin with the Bug algorithm and Vector Field Histogram (VFH) — reactive methods that make local decisions based on sensor readings. Progress to the Dynamic Window Approach (DWA), which plans velocity-space trajectories respecting kinematic constraints. Finish with Deep Reinforcement Learning (DRL), training an end-to-end policy in simulation (e.g., Gazebo or Isaac Sim) that transfers to the real robot.
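
To give a flavor of the middle tier, here is a simplified DWA sketch: sample velocities inside the dynamic window, roll out short constant-velocity trajectories, and score them by goal progress, clearance, and speed. The limits, cost weights, and safety radius are illustrative assumptions, not the exact terms of the original DWA formulation.

```python
import math
import numpy as np

# Simplified Dynamic Window Approach. State: (x, y, theta); obstacles: Nx2 array.

def simulate(state, v, w, dt=0.1, steps=10):
    """Roll out a constant (v, w) command; return trajectory points."""
    x, y, th = state
    traj = []
    for _ in range(steps):
        th += w * dt
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        traj.append((x, y))
    return np.array(traj)

def dwa_step(state, v_now, w_now, goal, obstacles,
             v_max=0.5, w_max=1.5, a_max=0.5, aw_max=2.0, dt=0.1):
    # Dynamic window: velocities reachable within one control cycle.
    vs = np.linspace(max(0.0, v_now - a_max * dt), min(v_max, v_now + a_max * dt), 7)
    ws = np.linspace(max(-w_max, w_now - aw_max * dt), min(w_max, w_now + aw_max * dt), 15)
    best, best_cost = (0.0, 0.0), float("inf")
    for v in vs:
        for w in ws:
            traj = simulate(state, v, w)
            clearance = np.min(np.linalg.norm(obstacles[None] - traj[:, None], axis=2))
            if clearance < 0.2:        # assumed safety radius: discard colliding rollouts
                continue
            cost = np.linalg.norm(traj[-1] - goal) + 0.3 / clearance - 0.1 * v
            if cost < best_cost:
                best, best_cost = (v, w), cost
    return best                         # (v, w) command to execute this cycle

# Example: v, w = dwa_step((0, 0, 0), 0.2, 0.0, np.array([3.0, 1.0]),
#                          np.array([[1.5, 0.2], [2.0, -0.5]]))
```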

Skills you will practice: Range sensor processing, local planning, velocity profiling, sim-to-real transfer.


3. SLAM & Autonomous Navigation

Task: Build a map of an unknown indoor environment and use it for autonomous goal-directed navigation.

Algorithm spectrum: Start with gmapping (a particle-filter-based 2D SLAM system). Move to Cartographer for more robust graph-based 2D/3D SLAM with loop closure. Finally, explore ORB-SLAM3 — a feature-based visual SLAM system that works with monocular, stereo, and RGB-D cameras, enabling SLAM in GPS-denied environments.
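
A core data structure behind these 2D systems is the log-odds occupancy grid. The sketch below shows a single-beam update with assumed increment values; a real system like gmapping folds this into a full particle filter over robot poses.

```python
import numpy as np

# Log-odds occupancy grid update: each range beam lowers the log-odds of
# cells it passes through (free) and raises the cell where it hits (occupied).

L_OCC, L_FREE = 0.85, -0.4            # assumed log-odds increments

def bresenham(x0, y0, x1, y1):
    """Integer grid cells along a beam from (x0, y0) to (x1, y1)."""
    cells, dx, dy = [], abs(x1 - x0), -abs(y1 - y0)
    sx, sy = (1 if x0 < x1 else -1), (1 if y0 < y1 else -1)
    err = dx + dy
    while True:
        cells.append((x0, y0))
        if (x0, y0) == (x1, y1):
            return cells
        e2 = 2 * err
        if e2 >= dy: err += dy; x0 += sx
        if e2 <= dx: err += dx; y0 += sy

def update_grid(grid, robot_cell, hit_cell):
    """Apply one beam: free space up to the hit, occupied at the hit."""
    beam = bresenham(*robot_cell, *hit_cell)
    for cx, cy in beam[:-1]:
        grid[cy, cx] += L_FREE
    hx, hy = beam[-1]
    grid[hy, hx] += L_OCC

grid = np.zeros((100, 100))           # log-odds 0 means unknown (p = 0.5)
update_grid(grid, robot_cell=(10, 10), hit_cell=(40, 25))
prob = 1 - 1 / (1 + np.exp(grid))     # convert log-odds back to probability
```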

Skills you will practice: LiDAR/camera calibration, occupancy grids, pose graph optimization, loop closure detection, map-based navigation (Nav2).


4. Visual Object Tracking

Task: Track a specific moving object across video frames in real time, maintaining identity even through occlusions.

Algorithm spectrum: Implement the Kernelized Correlation Filter (KCF) as a fast, hand-crafted tracker. Advance to SORT/DeepSORT, which combines detection (YOLO) with Kalman filtering and Hungarian assignment for multi-object tracking. Push further with Transformer-based trackers (e.g., TransTrack, OSTrack) that leverage attention mechanisms for robust, long-term tracking.
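
The heart of SORT is associating the current frame's detections with existing tracks. Below is a minimal version of that step using IoU cost and `scipy.optimize.linear_sum_assignment` (the Hungarian step); the IoU threshold is an assumed tuning value, and the Kalman prediction of track boxes is omitted.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# SORT-style data association. Boxes are (x1, y1, x2, y2).

def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks, detections, iou_threshold=0.3):
    """Return (matches, unmatched_track_ids, unmatched_detection_ids)."""
    if not tracks or not detections:
        return [], list(range(len(tracks))), list(range(len(detections)))
    # Cost = 1 - IoU, so the Hungarian solver maximizes total overlap.
    cost = np.array([[1 - iou(t, d) for d in detections] for t in tracks])
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols)
               if 1 - cost[r, c] >= iou_threshold]
    matched_t = {r for r, _ in matches}
    matched_d = {c for _, c in matches}
    return (matches,
            [i for i in range(len(tracks)) if i not in matched_t],
            [j for j in range(len(detections)) if j not in matched_d])
```

Unmatched detections seed new tracks; tracks unmatched for several frames are dropped.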

Skills you will practice: Image feature extraction, bounding box association, Kalman filtering, attention mechanisms, real-time inference optimization.


5. Robotic Arm Grasping

Task: Enable a robotic arm to grasp objects of various shapes from a tabletop.

Algorithm spectrum: Begin with analytical inverse kinematics (IK) for known object poses using geometric or numerical solvers. Graduate to GraspNet, a deep learning model that predicts 6-DoF grasp poses from point clouds. Conclude with RL-based grasping, where a policy learns through trial-and-error in simulation (Isaac Gym) to handle unknown objects, partial occlusions, and cluttered scenes.
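
For intuition about the analytical tier, here is closed-form IK for a planar two-link arm, the textbook warm-up before a full 6-DoF solver:

```python
import math

# Analytical IK for a planar 2-link arm with link lengths l1, l2.
# Target (x, y) is expressed in the arm's base frame.

def two_link_ik(x, y, l1, l2, elbow_up=True):
    """Return joint angles (theta1, theta2) reaching (x, y), or None."""
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle directly.
    c2 = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(c2) > 1:
        return None                     # target out of reach
    s2 = math.sqrt(1 - c2 * c2) * (1 if elbow_up else -1)
    theta2 = math.atan2(s2, c2)
    # Shoulder angle: direction to target minus the wrist's angular offset.
    theta1 = math.atan2(y, x) - math.atan2(l2 * s2, l1 + l2 * c2)
    return theta1, theta2

# Example: reach (0.5, 0.3) with two 0.4 m links.
print(two_link_ik(0.5, 0.3, 0.4, 0.4))
```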

Skills you will practice: Forward/inverse kinematics, URDF modeling, point cloud processing, reward shaping, sim-to-real for manipulation.


6. Voice Interaction Robot

Task: Build a robot that understands spoken commands and responds intelligently through natural conversation.

Algorithm spectrum: Start with a pipeline of automatic speech recognition (ASR) + keyword spotting for simple command detection. Build a full NLU pipeline with intent classification and slot filling for structured command understanding. Finally, integrate a large language model (LLM) that handles open-ended dialogue, contextual reasoning, and complex instruction following.
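
The first tier can be surprisingly little code once a speech-to-text engine hands you a transcript. A toy keyword-spotting sketch follows; the command vocabulary is made up for illustration.

```python
# Map an ASR transcript to a robot command by keyword spotting.
# The transcript string would come from any speech-to-text engine.

COMMANDS = {
    "forward":    {"move forward", "go forward", "go straight"},
    "stop":       {"stop", "halt", "freeze"},
    "turn_left":  {"turn left", "go left"},
    "turn_right": {"turn right", "go right"},
}

def parse_command(transcript: str):
    """Return the first command whose keyword phrase appears in the text."""
    text = transcript.lower()
    for command, phrases in COMMANDS.items():
        if any(phrase in text for phrase in phrases):
            return command
    return None  # no keyword matched; a real system would ask for clarification

print(parse_command("please go forward a little"))  # -> "forward"
print(parse_command("uh, turn right now"))          # -> "turn_right"
```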

Skills you will practice: Audio processing, speech-to-text APIs, intent/slot models, prompt engineering, LLM API integration, latency management.


7. Multi-Robot Formation

Task: Coordinate a team of mobile robots to maintain a desired geometric formation while navigating.

Algorithm spectrum: Implement the Leader-Follower approach where one robot leads and others track relative positions. Upgrade to Consensus-based algorithms where all robots agree on shared states through distributed communication. Explore Multi-Agent Reinforcement Learning (MARL), where agents learn cooperative policies that adapt to dynamic environments and communication failures.
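
A minimal leader-follower sketch: each follower P-controls toward a target pose rigidly attached to the leader's frame. The gains and the differential-drive command convention are assumptions to adapt to your platform.

```python
import math

# Leader-follower station keeping. Poses are (x, y, theta) in the world
# frame; the command is (v, omega) for a differential-drive robot.

def follower_cmd(follower, leader, offset, k_v=1.0, k_w=2.0):
    """P-control toward a point at `offset` in the leader's body frame."""
    lx, ly, lth = leader
    # Rotate the body-frame offset into the world frame.
    tx = lx + offset[0] * math.cos(lth) - offset[1] * math.sin(lth)
    ty = ly + offset[0] * math.sin(lth) + offset[1] * math.cos(lth)
    fx, fy, fth = follower
    dx, dy = tx - fx, ty - fy
    distance = math.hypot(dx, dy)
    heading_error = math.atan2(dy, dx) - fth
    # Wrap the heading error to (-pi, pi].
    heading_error = math.atan2(math.sin(heading_error), math.cos(heading_error))
    return k_v * distance, k_w * heading_error  # (v, omega)

# Follower keeps station 1 m behind and 0.5 m left of the leader.
v, w = follower_cmd(follower=(0, 0, 0), leader=(2, 0, 0), offset=(-1.0, 0.5))
```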

Skills you will practice: Distributed systems, ROS 2 multi-robot communication, graph theory for consensus, MARL training (e.g., MAPPO), communication-robust policies.


8. Vision-Language Navigation

Task: Guide a robot through an environment using natural language instructions (e.g., "Go past the red sofa and into the kitchen").

Algorithm spectrum: Build a modular pipeline with separate components for instruction parsing, visual feature extraction, and path planning. Train a Seq2Seq model that maps language and visual observations directly to navigation actions. Explore foundation model-based approaches (e.g., using CLIP, LLaVA, or GPT-4V) that perform zero-shot or few-shot VLN by grounding language in visual scenes.
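
A skeleton of the modular tier, with perception and navigation stubbed out: extract landmark words from the instruction, match them against current detections, and emit a subgoal. Everything here (the landmark vocabulary, the detection format) is a hypothetical placeholder for your own components.

```python
# Modular VLN skeleton: instruction parsing -> grounding -> subgoal selection.

LANDMARKS = {"sofa", "kitchen", "table", "door", "chair"}  # toy vocabulary

def parse_instruction(instruction: str):
    """Extract landmark words in the order they are mentioned."""
    words = instruction.lower().replace(",", " ").split()
    return [w for w in words if w in LANDMARKS]

def next_subgoal(instruction, detections):
    """detections: dict mapping detected object name -> (x, y) position."""
    for landmark in parse_instruction(instruction):
        if landmark in detections:
            return landmark, detections[landmark]
    return None  # nothing mentioned is visible yet; fall back to exploration

goal = next_subgoal("Go past the red sofa and into the kitchen",
                    detections={"sofa": (3.0, 1.5), "chair": (1.0, 0.5)})
print(goal)  # -> ("sofa", (3.0, 1.5))
```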

Skills you will practice: Vision-language models, attention mechanisms, embodied AI simulators (Habitat, AI2-THOR), instruction grounding, evaluation metrics (SR, SPL).


9. Object Assembly with TAMP

Task: Plan and execute a multi-step object assembly task (e.g., building a simple structure from blocks).

Algorithm spectrum: Start with PDDL-based classical planning where task logic is hand-written in a planning domain description language. Move to integrated Task and Motion Planning (TAMP) that interleaves symbolic task planning with continuous motion planning (e.g., PDDLStream). Finally, explore LLM-based task planning, where a large language model generates action sequences from natural language task descriptions and scene observations.
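
The shape of the TAMP tier is a plan-refine-execute loop, sketched below. All interfaces are illustrative stubs: `task_planner`, `motion_planner`, and `world` stand in for, e.g., a PDDL solver, MoveIt, and your simulator.

```python
# Plan-refine-execute loop at the core of TAMP-style assembly.

def execute_plan(plan, motion_planner, world):
    """Refine each symbolic action into motion; False if one is infeasible."""
    for action, args in plan:
        trajectory = motion_planner(action, args, world)  # geometric refinement
        if trajectory is None:
            return False      # action is symbolically valid but geometrically infeasible
        world.execute(trajectory)
    return True

def tamp_loop(task_planner, motion_planner, world, max_replans=5):
    """Interleave symbolic planning with motion refinement."""
    for _ in range(max_replans):
        plan = task_planner(world.symbolic_state())  # e.g., a PDDL solver
        if plan is None:
            return False                             # task is unsolvable
        if execute_plan(plan, motion_planner, world):
            return True                              # assembly complete
        # A refinement failure updated the world state; re-plan symbolically.
    return False
```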

Skills you will practice: Symbolic planning, PDDL encoding, motion planning (MoveIt), plan-and-execute loops, LLM grounding for robotics, constraint satisfaction.


10. Mobile Manipulation

Task: Combine mobile base locomotion with arm manipulation to perform tasks that require whole-body coordination (e.g., fetching objects from shelves).

Algorithm spectrum: Start with a decoupled approach where the base navigates and the arm manipulates independently with a handoff. Progress to joint planning that coordinates base and arm motions simultaneously for improved reachability and efficiency. Explore foundation model-based control (e.g., RT-2, Octo) that leverages large-scale pre-training for generalized mobile manipulation across diverse tasks and environments.
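
A sketch of the decoupled tier, with navigation, perception, and grasping hidden behind hypothetical interfaces (`navigate_to`, `perceive`, `plan_grasp`, and `execute_grasp` are placeholders for your own stack, e.g., Nav2 for the base and MoveIt for the arm):

```python
import math

# Decoupled mobile manipulation: navigate, hand off, then grasp.

def compute_standoff_pose(object_pose, distance):
    """Base pose `distance` meters back from the object, facing it."""
    ox, oy, oth = object_pose
    return (ox - distance * math.cos(oth), oy - distance * math.sin(oth), oth)

def fetch_object(robot, object_pose, standoff=0.6):
    # 1. Base phase: park within arm's reach of the object.
    base_goal = compute_standoff_pose(object_pose, distance=standoff)
    if not robot.navigate_to(base_goal):       # e.g., a Nav2 action call
        return False
    # 2. Handoff: re-observe the object from the new viewpoint, since
    #    navigation error makes the original pose estimate stale.
    refined_pose = robot.perceive(object_pose)
    # 3. Arm phase: plan and execute a grasp from the parked base.
    grasp = robot.plan_grasp(refined_pose)     # e.g., a MoveIt pipeline
    return grasp is not None and robot.execute_grasp(grasp)
```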

Skills you will practice: Whole-body kinematics, integrated motion planning, task sequencing, foundation model fine-tuning, real-world deployment challenges.


Skills Prerequisites

Before starting these projects, you should have:

Foundational Skills

| Skill | Minimum Level | Recommended Resources |
|-------|---------------|-----------------------|
| Python | Intermediate — classes, async, decorators | Preliminary |
| ROS 2 | Basic — nodes, topics, services, launch files | ROS Tutorials |
| Linux CLI | Comfortable — bash, SSH, tmux | Preliminary |
| Git | Basic — branch, merge, PR workflow | |
| Linear Algebra | Vectors, matrices, transforms | Robotics Math |
| Control Theory | PID, state-space basics | Planning |

Project-Specific Knowledge

| Project Group | Recommended Knowledge |
|---------------|------------------------|
| Navigation (Projects 1–2) | Planning algorithms, Simulation |
| SLAM (Project 3) | ROS 2, Perception basics |
| Perception (Project 4) | Deep learning basics, OpenCV |
| Manipulation (Projects 5, 10) | Manipulation fundamentals, Simulation |
| Language (Project 6) | LLM basics, NLP fundamentals |
| Multi-Agent (Project 7) | Multi-agent RL, ROS 2 multi-robot |
| VLN (Project 8) | Perception, Language grounding |
| TAMP (Project 9) | Planning, Manipulation, Agents |

Hardware & Software

| Resource | Details |
|----------|---------|
| Simulation | Gazebo, Isaac Sim, MuJoCo, PyBullet, Habitat (see Simulation) |
| Robot Platforms | TurtleBot 3/4, Franka Emika, UR5, custom ROS 2 robots |
| Compute | GPU recommended for Projects 4–10 (RTX 3060 or better) |
| ROS Version | ROS 2 Humble / Iron (see ROS setup) |

How to Use These Projects

  1. Pick a project that matches your current skill level and interests.
  2. Complete the Traditional tier first — it builds intuition and forces you to debug the full pipeline end to end.
  3. Move to the Classical Learning tier — compare against the traditional baseline.
  4. Experiment with the Modern tier — push the boundaries of what's possible.
  5. Document your results — each project should produce a short report comparing all three tiers.

Project Portfolio

Completing all 10 projects gives you a portfolio spanning the full robotics stack — from low-level control to high-level reasoning with foundation models. This is exactly the breadth that top robotics labs and companies look for.