This repository contains the training code for *Learning Vision-based Agile Flight via Differentiable Physics*, accepted at Nature Machine Intelligence (2025). A closely related paper about this project by the same authors can be found here.
This is a fork of the original project repository. The master branch contains the original code and the updated branch contains all changes.
The code has been used with the following environment:
- PyTorch: 2.8.0 for Linux and CUDA 12.8
- Python: 3.11.13
- CUDA: 12.8 (with Nvidia RTX 5080 GPU)
The code should be compatible with other PyTorch and CUDA versions.
I recommend installing Miniconda from the Linux terminal and creating a conda environment for this project. You can follow these steps:

```bash
# Create and activate conda environment
conda create --name drone python=3.11
conda activate drone

# Install dependencies
pip install torch==2.8.0 torchvision==0.23.0

# OpenCV is pinned to this version because NumPy cannot be upgraded
# to 2.0 or later (dependency conflict).
pip install opencv-python==4.10.0.82 numpy==1.26.4
pip install tqdm tensorboard
```
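To verify the install before building anything, a quick check that PyTorch sees the GPU can look like this (a minimal sketch; it does not exercise the CUDA extension built in the next step):

```python
import torch

# Print the installed PyTorch version and confirm the GPU is visible.
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```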
To build the CUDA operations, run the following command:

```bash
pip install -e src
```

To start the training process, use the following command:
```bash
# For multi-agent
python main_cuda.py $(cat configs/multi_agent.args)

# For single-agent
python main_cuda.py $(cat configs/single_agent.args)
```

multi_agent.args is unchanged from the original repo. single_agent.args includes all of the arguments from the original repo plus a few extra ones: --use_depth_ratio changes the network input from absolute depth to log(depth at t / depth at t-1), and --ckpt_dir specifies where model checkpoints are stored (saved every 10k iterations by default).
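For reference, the log depth-ratio input can be computed from two consecutive depth frames roughly as follows (a minimal sketch; the function name and epsilon guard are illustrative, not the exact training-code implementation):

```python
import torch

def log_depth_ratio(depth_t: torch.Tensor, depth_prev: torch.Tensor,
                    eps: float = 1e-6) -> torch.Tensor:
    # log(depth at t / depth at t-1); eps guards against zero depth values.
    return torch.log((depth_t + eps) / (depth_prev + eps))
```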
The training script will create a TensorBoard log folder under runs/. I recommend moving the folder containing the model checkpoints into the corresponding TensorBoard folder afterwards to keep them together.
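For example, something like the following keeps a run's checkpoints next to its logs (a minimal sketch; both paths are hypothetical and depend on your --ckpt_dir value and the auto-generated run name):

```python
import shutil
from pathlib import Path

ckpt_dir = Path("checkpoints")                      # hypothetical --ckpt_dir value
run_dir = Path("runs") / "May01_12-00-00_hostname"  # hypothetical run folder name

# Move the checkpoint folder into its matching TensorBoard run folder.
shutil.move(str(ckpt_dir), str(run_dir / ckpt_dir.name))
```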
Not much work or experimentation has been done with different models. Two models are currently available: the original model in model.py and a larger model based on a modified Xception architecture in xception_model.py. The training script and validation code import the original model from model.py by default; to change this, edit the import, as sketched below.
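For example, switching to the Xception variant amounts to changing the model import at the top of the training script (the class name here is an assumption; check xception_model.py for the actual exported name):

```python
# In main_cuda.py, replace the default import:
# from model import Model
from xception_model import Model  # assumed class name; verify in xception_model.py
```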
You need to download the simulation validation code from here.
To evaluate a trained model in the multi-agent setting, use the following command to launch the simulator (this applies only to multi-agent swarms):
```bash
cd <path to multi agent code supplementary>
./LinuxNoEditor/Blocks.sh -ResX=896 -ResY=504 -windowed -WinX=512 -WinY=304 -settings=$PWD/settings.json
```

Then, run the following command to evaluate the trained model:
```bash
python eval.py --resume <path to checkpoint> --target_speed 2.5
```

For evaluation/testing of the single-agent settings and models, see the readme in validation_code/high_speed_flight/ after downloading and unzipping it from here. Put the folder in this directory so that the relative path to it is DiffPhysDrone/validation_code/.
If you use this repository, please cite our work:
```bibtex
@article{zhang2025learning,
  title={Learning vision-based agile flight via differentiable physics},
  author={Zhang, Yuang and Hu, Yu and Song, Yunlong and Zou, Danping and Lin, Weiyao},
  journal={Nature Machine Intelligence},
  pages={1--13},
  year={2025},
  publisher={Nature Publishing Group}
}
```