This repository contains the training code for *Learning Vision-based Agile Flight via Differentiable Physics*, accepted at Nature Machine Intelligence (2025). A closely related paper about this project by the same authors can be found here.
This is a fork of the original project repository. The master branch contains the original code and the updated branch contains all changes.
The code has been used with the following environment:
- PyTorch: 2.8.0 for Linux and CUDA 12.8
- Python: 3.11.13
- CUDA: 12.8 (with Nvidia RTX 5080 GPU)
The code should be compatible with other PyTorch and CUDA versions.
I recommend installing Miniconda from the Linux terminal and creating a conda environment for this project. You can follow these steps:

```bash
# Create and activate conda environment
conda create --name drone python=3.11
conda activate drone

# Install dependencies
pip install torch==2.8.0 torchvision==0.23.0

# OpenCV is pinned to this version because NumPy cannot be upgraded
# to 2.0 or later (dependency conflict).
pip install opencv-python==4.10.0.82 numpy==1.26.4
pip install tqdm tensorboard
```
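To verify the install before building anything, a quick check that PyTorch sees the GPU can look like this (a minimal sketch; it does not exercise the CUDA extension built in the next step):

```python
import torch

# Print the installed PyTorch version and confirm the GPU is visible.
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```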
To build the CUDA operations, run the following command:

```bash
pip install -e src
```

To start the training process, use the following command:
```bash
# For multi-agent
python main_cuda.py $(cat configs/multi_agent.args)

# For single-agent
python main_cuda.py $(cat configs/single_agent.args)
```

multi_agent.args is unchanged from the original repo. single_agent.args includes all of the arguments from the original repo plus a few extra ones: --use_depth_ratio changes the network input from absolute depth to log(depth at t / depth at t-1), and --ckpt_dir specifies where model checkpoints are stored (saved every 10k iterations by default).
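For reference, the log depth-ratio input can be computed from two consecutive depth frames roughly as follows (a minimal sketch; the function name and epsilon guard are illustrative, not the exact training-code implementation):

```python
import torch

def log_depth_ratio(depth_t: torch.Tensor, depth_prev: torch.Tensor,
                    eps: float = 1e-6) -> torch.Tensor:
    # log(depth at t / depth at t-1); eps guards against zero depth values.
    return torch.log((depth_t + eps) / (depth_prev + eps))
```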
The training script will create a TensorBoard log folder under runs/. I recommend moving the folder containing the model checkpoints into the corresponding TensorBoard folder afterwards to keep them together.
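For example, something like the following keeps a run's checkpoints next to its logs (a minimal sketch; both paths are hypothetical and depend on your --ckpt_dir value and the auto-generated run name):

```python
import shutil
from pathlib import Path

ckpt_dir = Path("checkpoints")                      # hypothetical --ckpt_dir value
run_dir = Path("runs") / "May01_12-00-00_hostname"  # hypothetical run folder name

# Move the checkpoint folder into its matching TensorBoard run folder.
shutil.move(str(ckpt_dir), str(run_dir / ckpt_dir.name))
```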
Not much work or experimentation has been done with different models. Two models are currently available: the original model in model.py and a larger model based on a modified Xception architecture in xception_model.py. The training script and validation code import the original model from model.py by default; to change this, edit the import, as sketched below.
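For example, switching to the Xception variant amounts to changing the model import at the top of the training script (the class name here is an assumption; check xception_model.py for the actual exported name):

```python
# In main_cuda.py, replace the default import:
# from model import Model
from xception_model import Model  # assumed class name; verify in xception_model.py
```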
You need to download the simulation validation code from here.
To evaluate a trained model in the multi-agent setting, use the following command to launch the simulator (this applies only to multi-agent swarms):
```bash
cd <path to multi agent code supplementary>
./LinuxNoEditor/Blocks.sh -ResX=896 -ResY=504 -windowed -WinX=512 -WinY=304 -settings=$PWD/settings.json
```

Then, run the following command to evaluate the trained model:
```bash
python eval.py --resume <path to checkpoint> --target_speed 2.5
```

For evaluation/testing of the single-agent settings and models, see the readme in validation_code/high_speed_flight/ after downloading and unzipping it from here. Put the folder in this directory so that the relative path to it is DiffPhysDrone/validation_code/.
If you use this repository, please cite our work:
```bibtex
@article{zhang2025learning,
  title={Learning vision-based agile flight via differentiable physics},
  author={Zhang, Yuang and Hu, Yu and Song, Yunlong and Zou, Danping and Lin, Weiyao},
  journal={Nature Machine Intelligence},
  pages={1--13},
  year={2025},
  publisher={Nature Publishing Group}
}
```