Vision Transformers for End-to-End Vision-Based Quadrotor Obstacle Avoidance

Bhattacharya, Anish; Rao, Nishanth; Parikh, Dhruv; Kunapuli, Pratik; Matni, Nikolai; Kumar, Vijay

(Submitted on 16 May 2024)

Abstract: We demonstrate the capabilities of an attention-based end-to-end approach for high-speed quadrotor obstacle avoidance in dense, cluttered environments, with comparison to various state-of-the-art architectures. Quadrotor unmanned aerial vehicles (UAVs) have tremendous maneuverability when flown fast; however, as flight speed increases, traditional vision-based navigation via independent mapping, planning, and control modules breaks down due to increased sensor noise, compounding errors, and increased processing latency. Thus, learning-based, end-to-end planning and control networks have been shown to be effective for online control of these fast robots through cluttered environments. We train and compare convolutional, U-Net, and recurrent architectures against vision transformer models for depth-based end-to-end control, in a photorealistic, high-physics-fidelity simulator as well as in hardware, and observe that the attention-based models are more effective as quadrotor speeds increase, while recurrent models with many layers provide smoother commands at lower speeds. To the best of our knowledge, this is the first work to utilize vision transformers for end-to-end vision-based quadrotor control.

Comments:	8 pages, 10 figures, 3 tables
Subjects:	Robotics (cs.RO); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
Cite as:	arXiv:2405.10391 [cs.RO]
	(or arXiv:2405.10391v1 [cs.RO] for this version)

Submission history

From: Anish Bhattacharya [view email]
[v1] Thu, 16 May 2024 18:36:43 GMT (11683kb,D)
