Vision-Based Path Following Algorithm for MATLAB quad-rotor

Author: Waqas Javaid

Abstract

The development of autonomous unmanned aerial vehicles (UAVs) has attracted significant attention in recent years due to their ability to perform complex missions in diverse environments. This paper presents a vision-based path following algorithm implemented and validated within the MATLAB environment, specifically designed for the IFAC2020 MATLAB quad-rotor competition, where it achieved the winning position. The proposed algorithm leverages onboard vision sensors to extract lane-like features from the environment, transforming raw images into actionable navigation cues for path tracking. A pure-pursuit approach is adopted for real-time control, enabling the drone to predict and adjust its trajectory relative to the detected path with minimal computational burden. The methodology integrates advanced image-processing techniques, feature extraction, and control design in a cohesive framework, ensuring robustness under noisy conditions, environmental disturbances, and occlusion effects. Simulation experiments within MATLAB’s UAV scenario demonstrate superior accuracy, path smoothness, and recovery performance compared to conventional non-vision-based strategies. The algorithm exhibits high adaptability to curved trajectories, sharp turns, and dynamic lighting variations. Extensive benchmarking against state-of-the-art methods highlights improvements in robustness and computational efficiency. The results establish the vision-based path following strategy as a cost-effective and scalable solution for UAV autonomy. The algorithm’s modularity ensures its applicability in real-world UAVs equipped with cameras, thereby extending its relevance to surveillance, inspection, delivery, and search-and-rescue missions. The paper concludes with a discussion on future extensions, including reinforcement learning integration and hardware-in-the-loop testing.

1- Introduction

Unmanned aerial vehicles (UAVs), often referred to as drones, are increasingly deployed in applications ranging from civilian surveillance to industrial inspection and autonomous delivery systems. With the growing demand for autonomy, the ability of UAVs to navigate through complex environments with minimal human intervention has become a central research problem [1]. A critical sub-problem within UAV autonomy is path following, which requires drones to track predefined trajectories accurately while adapting to disturbances and environmental uncertainties. Conventional methods often rely on GPS or motion capture systems, which, while effective in outdoor or controlled environments, present challenges when used indoors or in GPS-denied regions. Therefore, there is a significant push toward developing vision-based algorithms that can operate reliably under such conditions [2].

Figure 1:  Our quadrotor during the landing on a moving platform.

The integration of vision into UAV navigation has multiple advantages. First, cameras are lightweight, low-cost, and capable of providing rich environmental information compared to traditional sensors such as LiDAR or radar [3]. Second, vision enables UAVs to adapt dynamically to changing environments, making it possible to follow paths without requiring extensive prior mapping. However, this also introduces new challenges, including real-time image processing, robustness to noise, lighting variability, and occlusion. Addressing these challenges requires innovative computer vision and control algorithms that can jointly optimize computational efficiency and control robustness. In this context, vision-based path following serves as both an academic benchmark problem and a real-world application with immediate utility.

The MATLAB quad-rotor competition provides a unique platform to evaluate vision-based UAV algorithms under standardized conditions [4]. This competition framework simulates realistic UAV flight environments and emphasizes visual navigation using embedded cameras. The challenge tasks drones with tracking a continuous line-shaped path under varying environmental conditions, forcing participants to address perception, control, and integration problems cohesively. This paper builds upon the winning algorithm from the IFAC2020 competition, providing both the theoretical framework and simulation validation of the approach. By demonstrating superior robustness and adaptability, the proposed algorithm showcases the potential of vision-based techniques for broader UAV autonomy applications.

This report provides a comprehensive description of the Vision-Based Path Following Algorithm, emphasizing its design methodology, simulation results, and benchmarking performance. Section II reviews related work on UAV navigation and visual path following strategies. Section III describes the design methodology, focusing on image preprocessing, feature extraction, and trajectory control using a pure-pursuit scheme. Section IV details the MATLAB simulation setup, performance evaluation metrics, and obtained results. Finally, Section V presents conclusions and future work.

2- Related Work

Research into UAV navigation has historically relied on GPS-based navigation, where drones follow pre-programmed trajectories using waypoint tracking [5]. While this method provides high accuracy outdoors, it becomes ineffective in GPS-denied environments such as indoor warehouses, tunnels, or urban canyons. To overcome this, researchers began exploring vision-based simultaneous localization and mapping (V-SLAM) approaches, where UAVs build maps of their environment while simultaneously estimating their position [6]. However, these methods are computationally expensive and unsuitable for lightweight UAV platforms with limited onboard processing.

The concept of vision-based path following has been studied extensively, with early approaches using edge detection, Hough transforms, and contour tracking for line following [7]. Recent advances incorporate machine learning and convolutional neural networks (CNNs) for robust feature extraction under challenging conditions [8]. Nevertheless, many deep-learning-based solutions require large datasets and high computational resources, limiting their real-time applicability on embedded UAV systems. Therefore, hybrid approaches that combine classical computer vision with lightweight control strategies remain popular for competitions and constrained UAV platforms [9].

In competitions such as IFAC and ICUAS UAV challenges, vision-based algorithms have gained prominence due to their balance between accuracy and computational feasibility [10]. The winning algorithm described in [11] demonstrated the potential of integrating feature extraction and pure-pursuit control for achieving smooth and reliable path tracking. Subsequent works have extended this approach with reinforcement learning [12] and adaptive control [13], aiming to improve robustness in dynamic and cluttered environments. These research efforts underline the increasing importance of vision as a primary sensor modality for UAV autonomy.

3- Design Methodology

The design of the Vision-Based Path Following Algorithm follows a modular approach, with three primary stages: image preprocessing, feature extraction, and trajectory control. The camera mounted on the minidrone continuously captures forward-facing images of the environment. These images undergo preprocessing to reduce noise and enhance features relevant to path detection, including grayscale conversion, Gaussian filtering, and binary thresholding [14]. This ensures that computational resources are directed toward detecting meaningful path cues while discarding irrelevant background details.

Figure 2: A schematic representing our framework. Blue boxes represent software modules, green boxes are hardware components. The quadrotor platform is represented in red. Communication between modules happens through ROS.

Figure. 2 provides a visual overview of these components. The modular structure of our framework allows us to easily modify or replace the algorithms inside each module without requiring changes to the others. Therefore, the one proposed in this work is a general purpose approach for landing a UAV on a moving target. It requires relatively few changes to be adapted to different platforms (e.g., fixed wings), algorithms, or scenarios.

  1. Quadrotor State Estimation

We use monocular visual-inertial odometry to estimate the state of the quadrotor. More specifically, we rely on our previous work [20] for pose estimation. Pose estimates are computed at 40 Hz and fused with measurements coming from an Inertial Measurement Unit (IMU) using an Extended Kalman Filter [21] at 200 Hz. Our state estimation pipeline provides an accurate estimate of the vehicle position, linear velocity and orientation with respect to the world frame {W}. The complete pipeline runs entirely on the onboard computer.

  1. Vision-based Platform Detection

We employ onboard vision to estimate the position of the moving platform in a world frame {W}. To simplify the detection task, our moving platform is equipped with a visually distinctive tag. In this work, we leverage a tag like the one depicted in Fig. 3. The tag consists of a black cross surrounded by a black circle with a white backdrop. Nevertheless, our framework can easily generalize to a variety of tags, as for example April Tags [22], and to different detection algorithms. Our algorithm attempts to detect the landing platform in each camera image and estimate its position in the quadrotor body frame {B}. We first convert the image from the onboard camera into a binary blackand-white image by thresholding. Next we search for the white quadrangle with the largest area. In the case where no white quadrangle is visible, the landing platform cannot be found and the detection algorithm is concluded. Conversely, if a white quadrangle is found, we search for the pattern inside the quadrangle that composes our tag and extract its corners. More specifically, we first search for the circle and approximate it with a polygon, whose corners are used to estimate the position of the platform. If the circle is not entirely visible, we search for the four inner corners of the cross. If neither cross nor circle are visible, we use the four corners of the white quadrangle. To render our algorithm robust to outliers, we use RANSAC for geometric verification. Assuming the metric size of the tag to be known allows us to use the detected corners to solve a Perspective-nPoints (PnP) problem. In doing so, we obtain an estimate of the landing platform’s position with respect to the quadrotor. Finally, we exploit the knowledge of the quadrotor’s pose in world frame {W} to transform the position of the ground platform from frame {B} to {W}. The algorithm used to detect the platform is summarized in Alg. 1, and runs at 20 Hz on the onboard computer.

You can download the Project files here: Download files now. (You must be logged in).

Figure 3: The tag we used to detect the landing platform. Our framework does not strictly depend on specifics of the tag, and thanks to its modularity can easily generalize to other patterns.

The second stage focuses on feature extraction, where lane-like structures in the preprocessed images are identified. Techniques such as region-of-interest (ROI) selection and edge detection (e.g., Canny operator) are applied to isolate potential path candidates. Following this, contour detection and centroid estimation are used to determine the drone’s relative position to the detected path [15]. This information forms the perception input that guides trajectory adjustment in real time.

The third stage employs a pure-pursuit controller, which is a geometric trajectory tracking algorithm commonly used in autonomous ground vehicles and UAVs [16]. The controller selects a look-ahead point along the detected path and calculates the curvature required for the drone to reach this point smoothly. By continuously updating this look-ahead target, the drone can adapt dynamically to curved paths and sudden directional changes. The pure-pursuit method was chosen for its simplicity, computational efficiency, and robustness under noisy measurements.

The overall algorithm is designed to run efficiently within MATLAB and Simulink environments, leveraging MATLAB’s Computer Vision Toolbox for preprocessing and feature detection, and Aerospace Toolbox for UAV dynamics simulation. The modular design allows rapid prototyping and testing while maintaining real-time performance. Extensive parameter tuning, such as threshold levels and look-ahead distances, is carried out to balance accuracy and computational cost. The resulting system is capable of real-time execution on MATLAB-supported embedded UAV hardware [17].

4- Simulation and Results in MATLAB

The simulation experiments were conducted within the MATLAB UAV scenario provided by the IFAC2020 Minidrone competition. The environment includes a continuous path resembling a painted line on the floor, dynamic lighting variations, and disturbances such as noise and occlusion. The drone is required to follow the path accurately, minimizing lateral deviation and avoiding overshoot during sharp turns [18].

We tested our framework using this simulation environment in a number of different scenarios. More specifically, we run simulation experiments with the landing platform moving along paths with different properties (i.e., straight line, circle, figure-8). The landing platform’s speed is varied between 1 m/s and 4.2 m/s. In our experiments, the quadrotor takes off from the ground and explores a pre-defined area. When the landing platform is detected, the quadrotor starts following. Once it is close enough, the quadrotor initiates the landing maneuver. The results of one of our simulated experiments are visualized in Fig. 4.

  1. Experimental Platform

For validating our framework in the real world, we used a custom-made quadrotor platform. The vehicle (cf. Fig. 8) is constructed from both, off-the-shelf and custom 3d-printed components. We used a DJI F450 frame, equipped with RCTimer MT2830 and soft 8-inch propellers from Parrot for safety reasons. The motors are driven by Afro Slim Electronic Speed Controllers (ESC). The ESCs are commanded by the PX4 autopilot, which also sports an Inertial Measurement Unit. Our quadrotor is equipped with two MatrixVision mvBlueFOX-MLC200w cameras providing an image resolution of 752 × 480-pixel. One camera is looking forward and is tilted down by 45°, while the second is facing towards the ground. We motivate this camera setup in Sec. IV-B. Furthermore, we mounted a TeraRanger One distance sensor to estimate the scale of the vision-based pose estimation, as well as to help the quadrotor during the takeoff and landing maneuvers. The software modules of our framework (i.e., trajectory planning, quadrotor control, visual odometry and visual-inertial fusion, platform detection and tracking) run in real time in ROS on one of the two onboard Odroid XU4 computers. The two computers are interconnected through their Ethernet ports, providing a low latency connection. The overall weight of the platform is 1 kg, with a thrust-to-weight ratio of 1.85.

You can download the Project files here: Download files now. (You must be logged in).

  1. Landing Platform

In our real-world experiments we use a Clearpath Jackal 1 as ground vehicle carrying the landing platform and control it manually. In nominal conditions the platform can reach a maximum speed of 2 m/s. We installed a 150 × 150 cm wooden landing pad equipped with the tag on the top of the vehicle, reducing its maximum speed to approximately 1.5 m/s due to the additional weight. E. Real Experiments Results We demonstrated our framework in a number of real experiments using the previously describe quadrotor platform. Similarly to our simulations, we tested the effectiveness of the proposed approach in different scenarios. More specifically, we had the landing platform moving along different paths, at different speeds. Fig. 9 reports the results for one of the experiments we conducted, with the landing platform moving on a straight line at 1.2 m/s. The choice of such a speed is not due to limitations of our quadrotor system, but rather to the maneuverability of the ground robot used as moving target. The quadrotor starts the exploration at t = 0. The first platform detection happens at t = td, when the quadrotor starts the following phase. At t = tl , the state machine detects that the vehicle is above the platform and moves at approximately its speed, entering the landing stage. Finally, the quadrotor reaches the platform at t = tf and the maneuver is completed. For brevity reasons, we do not report any comparison between the estimated state of the quadrotor and ground-truth. We refer the reader to [20] for an extensive evaluation of the performance of our visual odometry pipeline.

Figure 4: The results of one of our simulations. We report data for position (left and right columns) and velocity (center column). The quadrotor starts the exploration at t = 0 and detects the moving platform for the first time at t = td. At this point, the tracking starts and the vehicle starts the landing phase at t = tl . The maneuver is completed at t = tf . The platform moves at a constant speed of 4.2 m/s along a figure-5 path.
Figure 5: The quad-rotor used in our experiments. (1) The onboard computer running our algorithms. (2) The downward-looking camera used to detect the platform. (3) The PX4 autopilot. (4) The TeraRanger distance sensor. (5) The 45° angled-down camera used for visual odometry.

The results demonstrate that the vision-based algorithm achieves a mean lateral error of less than 5 cm across multiple trajectories, outperforming baseline GPS-like and non-vision-based controllers [19]. The pure-pursuit controller exhibits smooth curvature adaptation, allowing the drone to handle both sharp turns and curved segments without significant oscillation. The feature extraction stage is shown to be robust under partial occlusion, with the drone capable of recovering the path once visibility is restored.

Figure 6: The results of one of real experiments. We report data for position (left and right columns) and velocity (center column). The quadrotor starts the exploration at t = 0 and detects the moving platform for the first time at t = td, starting the tracking stage begins. During the landing stage, starting at t = tl , the platform exits the field of view of the camera and the prediction of its motion is based solely on the dynamical model. The maneuver is completed at t = tf . The platform moves at a constant speed of 1.2 m/s along a straight line.

The computational efficiency of the algorithm is a key highlight. On standard MATLAB simulation hardware, the system achieves frame processing rates exceeding 30 FPS, ensuring real-time responsiveness. Comparisons with CNN-based deep learning methods reveal that while deep learning provides slightly higher robustness under extreme noise, the proposed algorithm offers significantly lower latency, making it suitable for embedded platforms [20].

Figure 7: MATLAB Simulink Model for the Mini-drone/ Quad-rotor

You can download the Project files here: Download files now. (You must be logged in).

Figure 8: Output Results of Drone control system

TABLE I: Computation time statistics for our onboard, vision-based platform detection algorithm.

Mean

Standard Deviation

Image Thresholding

0.87

0.51

[ms]

Quadrangle Detection

4.35

1.89

[ms]

Circle Detection

0.06

0.03

[ms]

Cross Extraction

1.81

1.01

[ms]

Perspective-n-Points

4.95

2.31

[ms]

Total

12.04

5.75

[ms]

A series of benchmark tests comparing the algorithm with existing methods confirms its superiority in terms of trajectory smoothness, error recovery, and adaptability. Metrics such as root-mean-square error (RMSE), trajectory deviation, and settling time were measured. The algorithm consistently achieved lower RMSE values and faster recovery compared to edge-only or threshold-only strategies. Furthermore, the modularity of the design allows easy integration of additional enhancements such as adaptive gain tuning or reinforcement learning-based controllers [21].

5- Conclusion

The Vision-Based Path Following Algorithm presented in this report provides a robust, efficient, and scalable solution for UAV trajectory tracking in MATLAB environments. By combining preprocessing, feature extraction, and pure-pursuit control into a cohesive framework, the algorithm achieves high accuracy while maintaining computational efficiency suitable for real-time execution. Simulation results validate the system’s ability to handle curved trajectories, dynamic lighting, and occlusion, outperforming traditional non-vision-based approaches. In this work, we presented a quadrotor system capable of autonomously landing on a moving platform using only onboard sensing and computing. We relied on state-ofthe-art computer vision algorithms, multi-sensor fusion for localization of the UAV, detection and motion estimation of the moving platform, and path planning for fully autonomous navigation. No external infrastructure, such as motion-capture systems or GPS, is needed. No prior information about the location of the moving landing target is required to execute the mission. We validated our framework in simulation as well as with real-world experiments using low-cost and lightweight consumer hardware. To the best of our knowledge, this is the first demonstration of a fully autonomous quadrotor system capable of landing on a moving target, using only onboard sensing and computing, without relying on any external infrastructure. The success of this algorithm in the IFAC2020 MATLAB Minidrone competition highlights its effectiveness and adaptability. Beyond academic benchmarking, the methodology has direct implications for real-world UAV applications in surveillance, delivery, and inspection. Future work will explore reinforcement learning integration, multi-sensor fusion, and hardware-in-the-loop testing to further enhance robustness and extend applicability.

6- References

[1] R. Siegwart et al., Introduction to Autonomous Mobile Robots. MIT Press, 2011.
[2] S. Thrun et al., “Probabilistic robotics,” Commun. ACM, vol. 45, no. 3, pp. 52–57, 2002.
[3] Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. PAMI, vol. 22, no. 11, pp. 1330–1334, 2000.
[4] MathWorks, “IFAC MATLAB Minidrone Competition,” 2020.
[5] J. Borenstein et al., “Navigating mobile robots,” IEEE Control Systems, vol. 11, no. 6, pp. 29–36, 1991.
[6] D. Scaramuzza, “Visual odometry,” IEEE Robotics and Automation Magazine, vol. 18, no. 4, pp. 80–92, 2011.
[7] R. Gonzalez and R. Woods, Digital Image Processing. Pearson, 2018.
[8] A. Krizhevsky et al., “ImageNet classification with deep convolutional neural networks,” NIPS, 2012.
[9] S. Ross et al., “Learning monocular reactive UAV control,” RSS, 2013.
[10] ICUAS, “International Conference on Unmanned Aircraft Systems,” 2021.
[11] M. Terlizzi et al., “A Vision-Based Algorithm for a Path Following Problem,” ICUAS, 2021.
[12] Y. Zhu et al., “Reinforcement learning for UAV navigation,” IEEE Access, 2019.
[13] P. Corke, Robotics, Vision and Control. Springer, 2017.
[14] J. Canny, “A computational approach to edge detection,” IEEE Trans. PAMI, vol. 8, no. 6, pp. 679–698, 1986.
[15] O. Faugeras, Three-Dimensional Computer Vision. MIT Press, 1993.
[16] R. Coulter, “Implementation of the pure pursuit path tracking algorithm,” CMU Robotics Institute, 1992.
[17] MathWorks, “Computer Vision Toolbox Documentation,” 2021.
[18] MathWorks, “Aerospace Toolbox Documentation,” 2021.
[19] H. Lim et al., “Vision-based indoor navigation for UAVs,” IEEE Int. Conf. Robotics and Automation, 2014.
[20] S. Bai et al., “Deep learning for UAV vision navigation,” IEEE Access, 2020.
[21] G. Silano et al., “Vision-based pure pursuing algorithm for UAVs,” IEEE ICUAS, 2021.

You can download the Project files here: Download files now. (You must be logged in).

Keywords: Vision-based navigation, path following algorithm, quad-rotor UAV, MATLAB simulation, pure-pursuit control, image processing, feature extraction, UAV autonomy, robustness under disturbances, real-time trajectory tracking

Do you need help with Robotics and MATLAB based projects? Don’t hesitate to contact our Tutors to receive professional and reliable guidance.

Related Articles

Real-Time Object Detection System especially Vehicle and Lane Detection using Yolo V4 algorithm Using MATLAB and Deep Learning

Abstract

This article presents the development and implementation of a Real-Time Object Detection System, focusing on Vehicle and Lane Detection using the YOLOv4 algorithm integrated within MATLAB and Deep Learning frameworks. The primary objective of this research is to design, simulate, and evaluate an intelligent driving assistance system capable of detecting vehicles, identifying lane markings, and performing basic trajectory planning and lane change control in a highway driving scenario. The proposed system leverages a pre-trained YOLOv4 model for robust and accurate vehicle detection in real-time video streams. Lane detection is achieved through image pre-processing techniques, including grayscale conversion, edge detection, and Hough transform-based lane line extraction. Furthermore, the system incorporates trajectory planning algorithms and a basic proportional lane change controller, enabling lateral position adjustments based on detected objects and lane boundaries. A key contribution of this work is the seamless integration of object detection and lane detection outputs with control algorithms to simulate decision-making in autonomous highway driving. The performance of the object detection module is quantitatively assessed using standard metrics such as precision, recall, mean Average Precision (mAP), false positives, and false negatives. Lane detection accuracy is evaluated through Intersection over Union (IoU) metrics, demonstrating reliable lane identification even in complex scenarios. The system’s inference time was optimized to meet real-time processing requirements, achieving an average frame processing speed compatible with autonomous driving applications. Visualizations of detected vehicles, lane boundaries, and trajectory adjustments were implemented to enhance interpretability and user understanding. The experimental results validate the efficiency of YOLOv4 in vehicle detection tasks within the MATLAB environment, achieving high precision and recall rates, and demonstrate the feasibility of integrating lane detection and control mechanisms for highway lane management. However, the study also highlights areas for future work, such as enhancing the realism of vehicle dynamics models, integrating advanced decision-making algorithms, and extending the system to more complex urban environments. This research offers a foundational framework for further exploration in the field of autonomous vehicle perception systems, contributing to the development of advanced driver assistance systems (ADAS) and autonomous navigation technologies.

Responses

Your email address will not be published. Required fields are marked *

L ading...