
Line Following Robot

Project Type: Navigation | Difficulty: ★★☆☆☆ to ★★★★☆ (approach-dependent) | Estimated Time: 1–3 weekends


1. Project Overview

A line following robot is one of the classic introductory challenges in mobile robotics. The goal is simple: build a robot that detects a line (typically a black stripe on a white floor, or vice-versa) and steers itself to stay on that line as it moves forward.

  ┌───────────────────────────────────────────────┐
  │                PATH (top view)                │
  │                                               │
  │     ┌───┐                                     │
  │     │ R │──►                                  │
  │     └───┘    \                                │
  │               ╲  ─────────────────────        │
  │                ╱                              │
  │     ┌───┐   /                                 │
  │     │   │◄─┘                                  │
  │     └───┘                                     │
  │   Robot follows the black line                │
  └───────────────────────────────────────────────┘

In this project you will explore three progressively more sophisticated approaches:

  Approach        Method                                   Sensors
  ────────        ──────                                   ───────
  Traditional     PID control + thresholding               Camera (monocular)
  Intermediate    Stanley / Pure Pursuit controller        Camera or LiDAR
  Modern          Deep Reinforcement Learning (DQN/PPO)    Camera or simulated sensor array

Each approach builds on the previous one and introduces new concepts in control theory, path tracking, and learning-based methods.


2. Hardware & Software Requirements

Hardware

  Component                Specification                          Notes
  ─────────                ─────────────                          ─────
  Chassis                  2-wheel differential drive             Any kit (e.g., Arduino car kit)
  Motors                   2× DC gear motors with encoders        Encoders optional but recommended
  Microcontroller          Arduino Uno / Mega or Raspberry Pi     Pi needed for camera-based approaches
  Camera                   USB webcam or Pi Camera                Required for CV and RL approaches
  (Alternative) IR array   5–8 IR reflectance sensors             Simpler, no vision needed
  Motor driver             L298N / TB6612                         H-bridge driver
  Battery                  7.4 V LiPo or 8×AA                     Powers the motors and electronics
  Line track               Black electrical tape on white floor   Min 40 cm curve radius

Software

  Package                 Version              Purpose
  ───────                 ───────              ───────
  Python                  ≥ 3.8                Core language
  OpenCV                  ≥ 4.5                Image processing & line detection
  NumPy                   ≥ 1.20               Numerical computation
  PySerial                ≥ 3.5                Serial communication with MCU
  PyTorch / TensorFlow    ≥ 1.13 / ≥ 2.10      Deep RL (Approach 3)
  Stable-Baselines3       ≥ 1.7                PPO implementation
  Gymnasium               ≥ 0.28               RL environment interface

  pip install opencv-python numpy pyserial stable-baselines3 gymnasium

3. Control Loop — High-Level Architecture

All three approaches share the same fundamental sense → compute → act loop:

  ┌─────────────┐     ┌──────────────┐     ┌──────────────┐
  │  SENSOR     │────▶│  CONTROLLER  │────▶│  ACTUATOR    │
  │  (Camera/   │     │  (PID/Stanley│     │  (Motor PWM) │
  │   IR/LiDAR) │     │   /RL Agent) │     │              │
  └──────┬──────┘     └──────────────┘     └──────┬───────┘
         │                                        │
         │          ┌──────────────┐              │
         │          │  ENVIRONMENT │              │
         └──────────│  (Line track)│◀─────────────┘
                    └──────────────┘

The error signal is the horizontal offset (in pixels or meters) between the robot's current position and the detected line center.


4. Approach A — Traditional: PID Control with Thresholding

4.1 Concept

This is the simplest and most widely used method. We detect the line in a camera frame using image thresholding, compute the offset from the frame center, and feed this error into a PID controller that outputs a steering command.

4.2 PID Controller

The PID (Proportional–Integral–Derivative) control law is:

\[ u(t) = K_p \, e(t) \;+\; K_i \int_0^t e(\tau)\,d\tau \;+\; K_d \frac{de(t)}{dt} \]

where:

  • \(e(t)\) — lateral error (line center − frame center) in pixels
  • \(K_p\) — proportional gain (corrects present error)
  • \(K_i\) — integral gain (eliminates steady-state offset)
  • \(K_d\) — derivative gain (dampens oscillations)
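To make the control law concrete, here is one discrete update worked through by hand. The gains match Section 4.4's defaults; the error values and loop period are illustrative only:

```python
# One discrete PID update with the Section 4.4 gains (KP=0.4, KI=0.01, KD=0.3)
# and made-up error values: current error 50 px, previous error 30 px.
kp, ki, kd = 0.4, 0.01, 0.3
error, prev_error = 50.0, 30.0   # lateral error now and at the last cycle (px)
integral, dt = 0.0, 0.05         # accumulated integral, loop period (s)

integral += error * dt                  # 50 * 0.05 = 2.5
derivative = (error - prev_error) / dt  # (50 - 30) / 0.05 = 400.0
u = kp * error + ki * integral + kd * derivative
print(u)  # 20.0 + 0.025 + 120.0 ≈ 140.025
```

Note how the derivative term dominates here: a fast-growing error produces a strong corrective kick, which is exactly what damps oscillation.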

4.3 Line Detection Pipeline

  Camera frame (BGR)
  Convert to Grayscale
  Gaussian Blur (5×5)
  Binary Threshold (THRESH_BINARY_INV)
  Crop bottom half (region of interest)
  Find contours / compute centroid
  Error = centroid_x − frame_center_x

4.4 Complete Python Code

"""
Approach A: PID Line Following with Camera + OpenCV
=====================================================
Requirements: opencv-python, numpy, pyserial (optional for real hardware)
"""

import cv2
import numpy as np
import time

# ─── Configuration ──────────────────────────────────────────────
CAMERA_INDEX = 0            # 0 for default webcam
FRAME_WIDTH = 640
FRAME_HEIGHT = 480
ROI_TOP_RATIO = 0.5         # Use bottom 50% of frame as ROI
THRESHOLD_VALUE = 80        # Binary threshold (tune for your lighting)
BASE_SPEED = 150            # Base motor PWM (0–255)

# PID Gains (tune these for your robot)
KP = 0.4
KI = 0.01
KD = 0.3


class PIDController:
    """Discrete PID controller."""

    def __init__(self, kp: float, ki: float, kd: float):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.prev_error = 0.0
        self.integral = 0.0

    def update(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt if dt > 0 else 0.0
        output = self.kp * error + self.ki * self.integral + self.kd * derivative
        self.prev_error = error
        return output

    def reset(self):
        self.prev_error = 0.0
        self.integral = 0.0


def detect_line(frame: np.ndarray, roi_top_ratio: float = 0.5,
                thresh_val: int = 80) -> tuple:
    """
    Detect line in camera frame and return (centroid_x, error, masked_frame).

    Parameters
    ----------
    frame : BGR image from camera
    roi_top_ratio : fraction of frame height to crop from top (keep bottom portion)
    thresh_val : binary threshold value

    Returns
    -------
    centroid_x : x-coordinate of line centroid (-1 if not found)
    error : horizontal offset from frame center (pixels)
    masked : thresholded ROI image for visualization
    """
    h, w = frame.shape[:2]

    # 1. Grayscale + blur
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # 2. Region of Interest (bottom portion)
    roi_top = int(h * roi_top_ratio)
    roi = blurred[roi_top:h, :]

    # 3. Binary threshold (line appears dark)
    _, thresh = cv2.threshold(roi, thresh_val, 255, cv2.THRESH_BINARY_INV)

    # 4. Find contours
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    centroid_x = -1
    error = 0.0

    if contours:
        # Pick largest contour (assumed to be the line)
        largest = max(contours, key=cv2.contourArea)
        M = cv2.moments(largest)
        if M["m00"] > 0:
            centroid_x = int(M["m10"] / M["m00"])
            error = centroid_x - w // 2  # positive = line is to the right

    return centroid_x, error, thresh


def send_motor_command(left_speed: int, right_speed: int):
    """
    Send motor speeds to hardware.
    Replace this with your actual serial/GPIO communication.
    """
    # Example for Arduino serial protocol:
    # ser.write(f"M,{left_speed},{right_speed}\n".encode())
    print(f"  Motors -> L:{left_speed:4d}  R:{right_speed:4d}")


def main():
    cap = cv2.VideoCapture(CAMERA_INDEX)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, FRAME_WIDTH)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, FRAME_HEIGHT)

    if not cap.isOpened():
        print("[ERROR] Cannot open camera")
        return

    pid = PIDController(KP, KI, KD)
    prev_time = time.time()

    print("[INFO] PID Line Following started. Press 'q' to quit.")

    while True:
        ret, frame = cap.read()
        if not ret:
            break

        current_time = time.time()
        dt = current_time - prev_time
        prev_time = current_time

        # ── Sense ──
        centroid_x, error, thresh = detect_line(frame, ROI_TOP_RATIO, THRESHOLD_VALUE)

        # ── Compute ──
        steering = pid.update(error, dt)
        # Clamp steering
        max_steer = BASE_SPEED
        steering = max(-max_steer, min(max_steer, steering))

        # ── Act ──
        left_speed = int(BASE_SPEED + steering)
        right_speed = int(BASE_SPEED - steering)
        left_speed = max(0, min(255, left_speed))
        right_speed = max(0, min(255, right_speed))

        if centroid_x == -1:
            # Line lost — spin in place to search
            print("  [WARN] Line lost! Searching...")
            send_motor_command(-80, 80)
        else:
            send_motor_command(left_speed, right_speed)

        # ── Visualization ──
        roi_top = int(FRAME_HEIGHT * ROI_TOP_RATIO)
        cv2.rectangle(frame, (0, roi_top), (FRAME_WIDTH, FRAME_HEIGHT), (0, 255, 0), 2)
        if centroid_x >= 0:
            cv2.circle(frame, (centroid_x, roi_top + 30), 10, (0, 0, 255), -1)
        cv2.putText(frame, f"Error: {error:.0f}  Steer: {steering:.1f}",
                    (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 0), 2)

        cv2.imshow("Line Following - PID", frame)
        cv2.imshow("Threshold", thresh)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()
    print("[INFO] Stopped.")


if __name__ == "__main__":
    main()

4.5 Tuning the PID Gains

  1. Set \(K_i = 0\), \(K_d = 0\). Increase \(K_p\) until the robot oscillates noticeably.
  2. Set \(K_p\) to ~60% of that value.
  3. Increase \(K_d\) until the oscillations damp out.
  4. Add a small \(K_i\) to eliminate steady-state offset on long straights.
  5. Test on curves; re-adjust if the robot cuts corners or overshoots.

5. Approach B — Intermediate: Stanley Controller

5.1 Concept

The Stanley controller (developed at Stanford for the DARPA Grand Challenge) is a geometric path-tracking controller that accounts for both lateral error \(e\) and heading error \(\theta_e\). It produces smoother trajectories than PID on curved paths.

5.2 Control Law

\[ \delta = \theta_e + \arctan\!\left(\frac{k \, e}{v + \epsilon}\right) \]

where:

  • \(\delta\) — steering angle command (radians)
  • \(\theta_e\) — heading error between robot heading and tangent to the path
  • \(e\) — lateral (cross-track) error, positive to the right
  • \(v\) — forward velocity of the robot
  • \(k\) — gain parameter (controls convergence speed)
  • \(\epsilon\) — small constant to prevent division by zero

  Path tangent direction
  ──────────────────────►
            │  θ_e (heading error)
            │ /
            │/
      ┌─────┐
      │  R  │───── e (lateral error) ──►
      └─────┘
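Plugging illustrative numbers into the control law makes its behavior tangible. All values below are hypothetical: e = 0.2 m, θ_e = 0.1 rad, v = 1 m/s, with k = 2.5 and ε = 1e-6 as in the code that follows:

```python
import numpy as np

# Stanley control law with made-up error values
k, eps = 2.5, 1e-6
e, theta_e, v = 0.2, 0.1, 1.0   # lateral error (m), heading error (rad), speed (m/s)
delta = theta_e + np.arctan2(k * e, v + eps)
print(round(float(delta), 3))   # arctan(0.5) ≈ 0.464, so δ ≈ 0.564 rad
```

Note that as v grows, the arctan term shrinks: at high speed the controller steers more gently for the same lateral error, which is what keeps it stable.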

5.3 Path Representation

For line following, we represent the detected line as a sequence of waypoints. At each control cycle:

  1. Find the closest waypoint on the path to the robot's front axle.
  2. Compute \(e\) (signed distance from the path) and \(\theta_e\) (angle difference).
  3. Apply the Stanley formula.

5.4 Complete Python Code

"""
Approach B: Stanley Controller for Line Following
==================================================
Simulated differential-drive robot following a parametric path.
No hardware required — runs with matplotlib visualization.
"""

import numpy as np
import matplotlib.pyplot as plt

# ─── Parameters ─────────────────────────────────────────────────
DT = 0.05               # Time step (s)
TOTAL_TIME = 30.0        # Simulation duration (s)
V = 1.0                  # Forward speed (m/s)
K_STANLEY = 2.5          # Stanley gain
WHEEL_BASE = 0.3         # Robot wheelbase (m, for steering conversion)

# ─── Path Definition (a wavy line) ──────────────────────────────
def path_points(n=1000):
    """Generate a smooth sinusoidal path."""
    s = np.linspace(0, 20, n)
    x = s
    y = 1.5 * np.sin(0.3 * s)
    return x, y

PATH_X, PATH_Y = path_points()


def closest_point_index(px: float, py: float) -> int:
    """Find index of closest point on path to (px, py)."""
    dists = (PATH_X - px)**2 + (PATH_Y - py)**2
    return int(np.argmin(dists))


def path_heading(idx: int) -> float:
    """Tangent heading angle at path index."""
    dx = PATH_X[min(idx + 1, len(PATH_X)-1)] - PATH_X[max(idx - 1, 0)]
    dy = PATH_Y[min(idx + 1, len(PATH_Y)-1)] - PATH_Y[max(idx - 1, 0)]
    return np.arctan2(dy, dx)


def normalize_angle(a: float) -> float:
    """Wrap angle to [-pi, pi]."""
    return (a + np.pi) % (2 * np.pi) - np.pi


def stanley_control(x: float, y: float, heading: float, v: float) -> tuple:
    """
    Compute the Stanley steering command.

    Returns (steer, e, theta_e, idx): steering angle δ in radians, lateral
    error, heading error, and the index of the closest path point.
    """
    idx = closest_point_index(x, y)
    path_h = path_heading(idx)

    # Lateral error (signed): project the robot's offset onto the path normal
    dx = x - PATH_X[idx]
    dy = y - PATH_Y[idx]
    tangent = np.array([np.cos(path_h), np.sin(path_h)])
    normal = np.array([-tangent[1], tangent[0]])
    e = dx * normal[0] + dy * normal[1]

    # Heading error
    theta_e = normalize_angle(path_h - heading)

    # Stanley formula
    steer = theta_e + np.arctan2(K_STANLEY * e, v + 1e-6)
    return steer, e, theta_e, idx


def main():
    # Initial state
    x, y, heading = PATH_X[0] - 0.5, PATH_Y[0] + 0.5, 0.3
    trajectory_x, trajectory_y = [x], [y]
    errors = []

    steps = int(TOTAL_TIME / DT)
    for step in range(steps):
        steer, e, theta_e, idx = stanley_control(x, y, heading, V)

        # Differential drive: convert steering angle to wheel speeds
        # ω = v * tan(δ) / L
        omega = V * np.tan(steer) / WHEEL_BASE

        # Update state
        x += V * np.cos(heading) * DT
        y += V * np.sin(heading) * DT
        heading += omega * DT
        heading = normalize_angle(heading)

        trajectory_x.append(x)
        trajectory_y.append(y)
        errors.append(abs(e))

    # ── Plot Results ──
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 8))

    ax1.plot(PATH_X, PATH_Y, 'b--', label='Path', linewidth=2)
    ax1.plot(trajectory_x, trajectory_y, 'r-', label='Robot trajectory', linewidth=1.5)
    ax1.set_xlabel('X (m)')
    ax1.set_ylabel('Y (m)')
    ax1.set_title('Stanley Controller — Line Following')
    ax1.legend()
    ax1.set_aspect('equal')
    ax1.grid(True, alpha=0.3)

    ax2.plot(np.arange(len(errors)) * DT, errors, 'g-', linewidth=1)
    ax2.set_xlabel('Time (s)')
    ax2.set_ylabel('|Lateral Error| (m)')
    ax2.set_title('Cross-Track Error Over Time')
    ax2.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.savefig("stanley_line_following.png", dpi=150)
    plt.show()

    print(f"Average error: {np.mean(errors):.4f} m")
    print(f"Max error:     {np.max(errors):.4f} m")


if __name__ == "__main__":
    main()

5.5 PID vs Stanley — Key Difference

  PID Controller                    Stanley Controller
  ──────────────                    ──────────────────
  Only uses lateral error           Uses lateral + heading error
  No geometric model                Accounts for path tangent
  Oscillates on sharp curves        Smooth cornering
  Simple to implement               Needs path representation

       ╭───╮                              ╭───╮
      /  e  \  (overshoot)              /  e  \
  ───╱───────╲───                 ─────╱────────╲───
     Robot path                     Robot tracks path closely

6. Approach C — Modern: Deep Reinforcement Learning

6.1 Concept

Instead of hand-coding a controller, we train a neural network policy using reinforcement learning. The agent observes sensor readings (e.g., an array of IR values or a downsampled camera image) and learns to output motor commands that maximize cumulative reward (staying on the line, moving forward).

We use Proximal Policy Optimization (PPO) from Stable-Baselines3.

6.2 Environment Setup

  ┌─────────────────────────────────────────────┐
  │               RL Environment                │
  │                                             │
  │  State:  [s1, s2, ..., s8]  (sensor array)  │
  │                                             │
  │  Action: [left_speed, right_speed]          │
  │          continuous ∈ [-1, 1] each          │
  │                                             │
  │  Reward: +1.0 if on line                    │
  │          +0.3 × forward speed (if on line)  │
  │          -2.0 if off line                   │
  │          -0.01 × |CTE| - 0.01 per timestep  │
  └─────────────────────────────────────────────┘

6.3 Complete Code — Custom Gym Environment + PPO Training

"""
Approach C: Deep RL Line Following with PPO
============================================
Custom Gymnasium environment simulating a line-following task.
Trains a PPO agent to follow a line using an 8-sensor array.
"""

import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO
from stable_baselines3.common.env_checker import check_env


class LineFollowEnv(gym.Env):
    """
    Simulated line-following environment.

    - Track: sine-wave line on a 2D plane
    - Robot: differential drive, 8 front-facing sensors
    - State: 8 sensor readings ∈ [0, 1]  (1 = line detected)
    - Action: [left_speed, right_speed] ∈ [-1, 1]
    """

    metadata = {"render_modes": ["human"]}

    def __init__(self, render_mode=None):
        super().__init__()
        self.render_mode = render_mode

        # Observation: 8 sensor readings
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(8,), dtype=np.float32)
        # Action: left and right speed
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

        # Simulation parameters
        self.dt = 0.1
        self.max_steps = 500
        self.track_width = 0.5   # Distance at which robot is "on the line"
        self.sensor_spread = 0.3  # Half-width of sensor array

        self.robot_x = 0.0
        self.robot_y = 0.0
        self.robot_theta = 0.0
        self.step_count = 0

    def _track_y(self, x: float) -> float:
        """Y-coordinate of the line at position x."""
        return 2.0 * np.sin(0.3 * x)

    def _track_tangent(self, x: float) -> float:
        """Heading of the line tangent at position x."""
        dy_dx = 2.0 * 0.3 * np.cos(0.3 * x)
        return np.arctan2(dy_dx, 1.0)

    def _get_sensor_readings(self) -> np.ndarray:
        """
        Simulate 8 sensors spread across the robot's front.
        Each sensor reports 1.0 if it is close to the line, 0.0 if far.
        Use a Gaussian falloff.
        """
        readings = np.zeros(8, dtype=np.float32)
        for i in range(8):
            # Sensor position in robot frame
            offset = self.sensor_spread * (2 * i / 7 - 1)  # -0.3 to +0.3
            # Transform to world frame
            sx = self.robot_x + offset * np.cos(self.robot_theta + np.pi / 2)
            sy = self.robot_y + offset * np.sin(self.robot_theta + np.pi / 2)
            # Distance to line
            line_y = self._track_y(sx)
            dist = abs(sy - line_y)
            # Gaussian response
            readings[i] = np.exp(-(dist ** 2) / (2 * (self.track_width / 3) ** 2))
        return np.clip(readings, 0.0, 1.0)

    def _cross_track_error(self) -> float:
        """Signed distance from robot to the line."""
        return self.robot_y - self._track_y(self.robot_x)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.robot_x = 0.0
        self.robot_y = self._track_y(0.0) + self.np_random.uniform(-0.3, 0.3)
        self.robot_theta = self._track_tangent(0.0) + self.np_random.uniform(-0.2, 0.2)
        self.step_count = 0
        obs = self._get_sensor_readings()
        return obs, {}

    def step(self, action):
        self.step_count += 1
        left_speed = float(action[0])
        right_speed = float(action[1])

        # Differential drive kinematics
        v = 0.5 * (left_speed + right_speed)   # Linear velocity
        omega = 1.5 * (right_speed - left_speed)  # Angular velocity

        self.robot_x += v * np.cos(self.robot_theta) * self.dt
        self.robot_y += v * np.sin(self.robot_theta) * self.dt
        self.robot_theta += omega * self.dt

        # ── Reward ──
        cte = abs(self._cross_track_error())
        on_line = cte < self.track_width

        reward = 0.0
        if on_line:
            reward += 1.0                          # Bonus for being on line
            reward += 0.3 * max(v, 0)              # Bonus for moving forward
        else:
            reward -= 2.0                          # Penalty for leaving line

        reward -= 0.01 * cte                       # Proportional penalty
        reward -= 0.01                             # Small time penalty

        # Termination
        terminated = cte > 2.0                     # Too far off track
        truncated = self.step_count >= self.max_steps

        obs = self._get_sensor_readings()
        info = {"cross_track_error": cte, "on_line": on_line}

        return obs, reward, terminated, truncated, info

    def render(self):
        pass  # Visualization can be added with matplotlib


def train_ppo(total_timesteps: int = 100_000, save_path: str = "ppo_line_follower"):
    """Train a PPO agent on the line-following environment."""
    env = LineFollowEnv()
    check_env(env)  # Validate environment

    model = PPO(
        policy="MlpPolicy",
        env=env,
        learning_rate=3e-4,
        n_steps=1024,
        batch_size=64,
        n_epochs=10,
        gamma=0.99,
        gae_lambda=0.95,
        clip_range=0.2,
        verbose=1,
        tensorboard_log="./tb_logs/",
    )

    print(f"[INFO] Training PPO for {total_timesteps} timesteps...")
    model.learn(total_timesteps=total_timesteps)
    model.save(save_path)
    print(f"[INFO] Model saved to {save_path}")
    return model


def evaluate(model_path: str = "ppo_line_follower", episodes: int = 5):
    """Evaluate a trained model."""
    env = LineFollowEnv()
    model = PPO.load(model_path)

    for ep in range(episodes):
        obs, _ = env.reset()
        total_reward = 0.0
        done = False
        steps = 0

        while not done:
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, terminated, truncated, info = env.step(action)
            total_reward += reward
            done = terminated or truncated
            steps += 1

        print(f"  Episode {ep+1}: reward={total_reward:.1f}, "
              f"steps={steps}, final_CTE={info['cross_track_error']:.3f}")


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1 and sys.argv[1] == "eval":
        evaluate()
    else:
        train_ppo()

6.4 Training Tips

  Tip                     Detail
  ───                     ──────
  Curriculum learning     Start with a straight line, gradually increase curvature
  Reward shaping          Don't make the off-line penalty too harsh early on — the agent may learn to stand still
  Sensor noise            Add Gaussian noise to sensor readings for robustness
  Observation history     Stack the last 3–4 observations so the agent can infer its velocity
  Hyperparameters         Use the provided values as a starting point; tune learning_rate and clip_range

7. Step-by-Step Implementation Guide

Phase 1 — Set Up the Track (All Approaches)

  1. Lay black electrical tape on a white/reflective floor.
  2. Create straights, gentle curves (≥ 40 cm radius), and intersections.
  3. Ensure consistent lighting (avoid direct sunlight on the track).

Phase 2 — Approach A (PID)

  1. Mount the camera pointing downward-forward (30–45° from vertical).
  2. Calibrate the threshold value with cv2.threshold on a sample frame.
  3. Run detect_line() standalone and verify centroid detection.
  4. Connect the PID output to motor commands (via serial to Arduino).
  5. Tune \(K_p\), \(K_d\), \(K_i\) in that order (see Section 4.5).

Phase 3 — Approach B (Stanley)

  1. Collect waypoints along the line (drive manually, log GPS/odometry).
  2. OR generate waypoints from the camera-detected line (convert pixel → world coordinates).
  3. Implement stanley_control() and test in simulation first.
  4. Deploy on the robot with the same sensor pipeline as Approach A.
  5. Tune the gain \(k\) — start with \(k = 2.0\) and increase until oscillation appears.

Phase 4 — Approach C (Deep RL)

  1. Run the custom environment in simulation: python line_follow_rl.py
  2. Train the PPO agent (~100k timesteps, ~5 minutes on CPU).
  3. Evaluate with python line_follow_rl.py eval.
  4. (Optional) Deploy to real robot: replace simulated sensors with actual IR/camera readings, fine-tune with a few real-world episodes.
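For step 4, the only change on the real robot is where observations come from: read eight IR values, scale them to [0, 1], and hand them to model.predict. A hedged sketch; the "S,v1,...,v8" frame format and 10-bit ADC range are assumptions about your firmware:

```python
import numpy as np

def parse_ir_frame(line: str, adc_max: int = 1023) -> np.ndarray:
    """Parse one 'S,v1,...,v8' frame of raw ADC values into a [0, 1] observation."""
    parts = line.strip().split(",")
    assert parts[0] == "S" and len(parts) == 9, "malformed sensor frame"
    raw = np.array([float(p) for p in parts[1:]], dtype=np.float32)
    return np.clip(raw / adc_max, 0.0, 1.0)

# Deployment loop sketch (serial reader and motor sender not shown):
# obs = parse_ir_frame(ser.readline().decode())
# action, _ = model.predict(obs, deterministic=True)
# left, right = int(127 + 128 * action[0]), int(127 + 128 * action[1])
```

Because the policy only ever sees the 8-element observation vector, the sim-trained network runs unchanged; the sensor model mismatch is what the optional fine-tuning episodes correct.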

8. Comparison of Approaches

  Criteria                 PID + Thresholding   Stanley Controller    Deep RL (PPO)
  ────────                 ──────────────────   ──────────────────    ─────────────
  Accuracy (straight)      ★★★★☆                ★★★★★                 ★★★★☆
  Accuracy (curves)        ★★★☆☆                ★★★★★                 ★★★★☆
  Robustness to noise      ★★★☆☆                ★★★★☆                 ★★★★★
  Robustness to lighting   ★★☆☆☆                ★★★☆☆                 ★★★★☆
  Setup complexity         Low                  Medium                High
  Tuning effort            Medium (3 gains)     Low (1 gain)          High (hyperparams)
  Computation cost         Very low             Low                   High (inference)
  Requires training data   No                   No                    Yes (sim or real)
  Handles sharp turns      Poor                 Good                  Good (after training)
  Best for                 Simple tracks        Smooth racing lines   Complex/variable tracks

9. Extensions and Variations

9.1 Track Variations

  • Colored lines — follow a red/green/blue line instead of black using HSV filtering.
  • Dashed lines — handle gaps in the line (remember last known position).
  • Multi-line / intersections — decide which direction to turn at forks.
  • QR codes / markers — place markers along the track for localization.

9.2 Sensor Improvements

  • LiDAR-based — use a 2D LiDAR to detect floor reflectivity differences.
  • Dual-camera stereo — estimate line distance in 3D.
  • Event cameras — ultra-low-latency line detection at high speed.

9.3 Algorithm Enhancements

  • Model Predictive Control (MPC) — optimize over a prediction horizon for anticipatory steering.
  • Fuzzy logic control — linguistic rules instead of PID gains.
  • Behavior cloning — learn from human driving demonstrations.
  • Sim-to-real transfer — train RL in a photo-realistic simulator (Isaac Sim, Gazebo) and deploy on hardware.

9.4 Competition Challenges

  • Speed challenge — maximize lap speed (requires predictive control).
  • Endurance — run for 1 hour without intervention.
  • Obstacle avoidance — add objects on the track; combine line following with obstacle detection.

10. References

  1. Craig, J.J. (2005). Introduction to Robotics: Mechanics and Control. 3rd ed., Pearson.
  2. Thrun, S., Montemerlo, M., et al. (2006). "Stanley: The Robot that Won the DARPA Grand Challenge." Journal of Field Robotics, 23(9), 661–692.
  3. Hoffmann, G.M., Tomlin, C.J., Montemerlo, M., & Thrun, S. (2007). "Autonomous Automobile Trajectory Tracking for Off-Road Driving." IEEE Control Systems Magazine.
  4. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). "Proximal Policy Optimization Algorithms." arXiv:1707.06347.
  5. Raffin, A., Hill, A., Gleave, A., et al. (2021). "Stable-Baselines3: Reliable Reinforcement Learning Implementations." JMLR, 22(268), 1–8.
  6. OpenCV Documentation — docs.opencv.org
  7. Gymnasium Documentation — gymnasium.farama.org