Author's official implementation of the TPAMI paper "Generalizable Multi-modal Adversarial Imitation Learning for Non-stationary Dynamics"
- Code of the discriminator is in the file `GMAIL/algps/gaifo.py`, code of the generator is in the directory `GMAIL/algos/escp`, and code of the non-stationary environment is in the directory `GMAIL/envs`.
- Get demonstrations with the following command. The expert trajectories will be stored in the directory `expert_data/sac_HalfCheetah_gravity` if you set the environment name as `HalfCheetah` and the changing parameter as `gravity`.

      python generate_expert_data.py
- Run GMAIL in HalfCheetah with `gravity` as the changing parameter with the following command. Demonstrations are read from the directory `expert_data/sac_HalfCheetah_gravity`, so please collect demonstrations before running the code.

      python run_gmail.py --env_name HalfCheetah-v3 --varying_params gravity --expert-path-dir expert_data/sac_HalfCheetah_gravity --H_step 4 --use_rmdm --stop_pg_for_ep --bottle_neck --rbf_radius 3000 --name_suffix GMAIL --rnn_fix_length 16 --autoalpha
- Modify the parameters `env_name` and `varying_params` to run GMAIL in other tasks and with other varying parameters. For example, run GMAIL in Hopper with body mass (`--varying_params body_mass`) as the changing parameter with the following command. Here, demonstrations are in the directory `expert_data/sac_Hopper_mass`.

      python run_gmail.py --env_name Hopper-v3 --varying_params body_mass --expert-path-dir expert_data/sac_Hopper_mass --use_absorbing_state --H_step 4 --use_rmdm --stop_pg_for_ep --bottle_neck --rbf_radius 3000 --name_suffix GMAIL --rnn_fix_length 2
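
The two `run_gmail.py` commands above share most of their flags and differ only in the task, the varying parameter, the demonstration directory, `--rnn_fix_length`, and one extra switch each. The following is a minimal sketch of a helper that assembles such a command and checks that the demonstration directory is populated before launching. The names `build_gmail_command`, `missing_demonstrations`, and `COMMON_FLAGS` are hypothetical and not part of this repository, and the sketch assumes that `run_gmail.py` accepts its flags in any order.

```python
import shlex
from pathlib import Path

# Flags that appear in both example commands of this README.
COMMON_FLAGS = [
    "--H_step", "4",
    "--use_rmdm",
    "--stop_pg_for_ep",
    "--bottle_neck",
    "--rbf_radius", "3000",
    "--name_suffix", "GMAIL",
]


def build_gmail_command(env_name, varying_params, expert_dir,
                        rnn_fix_length, extra_flags=()):
    """Return the argument list for one run_gmail.py invocation (hypothetical helper)."""
    return [
        "python", "run_gmail.py",
        "--env_name", env_name,
        "--varying_params", varying_params,
        "--expert-path-dir", str(expert_dir),
        *COMMON_FLAGS,
        "--rnn_fix_length", str(rnn_fix_length),
        *extra_flags,
    ]


def missing_demonstrations(expert_dir):
    """True if the demonstration directory is absent or empty."""
    path = Path(expert_dir)
    return not path.is_dir() or not any(path.iterdir())


# Hopper example from this README; the directory is passed explicitly
# instead of being derived from the task and parameter names.
hopper_dir = "expert_data/sac_Hopper_mass"
hopper_cmd = build_gmail_command(
    "Hopper-v3", "body_mass", hopper_dir,
    rnn_fix_length=2, extra_flags=["--use_absorbing_state"],
)

if missing_demonstrations(hopper_dir):
    print("Collect demonstrations first: python generate_expert_data.py")
else:
    # Print the command; pass hopper_cmd to subprocess.run to launch it.
    print(shlex.join(hopper_cmd))
```

The demonstration directory is an explicit argument because the two examples above do not follow one obvious naming rule (`body_mass` is stored under `sac_Hopper_mass`), so deriving it automatically would be a guess.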