This repository contains the code for the paper Embedding Safety into RL: A New Take on Trust Region Methods.
It implements the algorithms C-NPG and C-TRPO and contains code to reproduce the benchmarking experiments from the paper.
For the computational experiments:
    conda create -n ctrpo python=3.9
    conda activate ctrpo
    pip install -r requirements.txt
    python algorithms/c-trpo.py --task SafetyAntVelocity-v1

To draw plots, run the respective notebook, e.g. plots/plots_benchmark.ipynb for the benchmark plots.
By default, the data loading functions in plots/helpers.py will load data from data/runs, which we provide in case you don't have the resources or the time to re-run the whole benchmark.
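Before opening a notebook, it can help to confirm that the provided run data is actually present. The snippet below is a minimal sketch and not part of the repository: `list_run_files` is a hypothetical helper, it does not use or mirror anything in plots/helpers.py, and it assumes only that data/runs is a directory of files relative to the repository root.

```python
from pathlib import Path


def list_run_files(root="data/runs"):
    """Return all files below `root` as sorted paths relative to `root`.

    Hypothetical helper for illustration; not part of plots/helpers.py.
    """
    root = Path(root)
    if not root.is_dir():
        # Directory missing: the provided data is absent or the path is wrong.
        return []
    return sorted(p.relative_to(root) for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    files = list_run_files()
    print(f"found {len(files)} files under data/runs")
    for path in files[:10]:  # show only the first few entries
        print(path)
```

If this reports zero files when run from the repository root, the notebooks will probably have nothing to load until the data is restored or the benchmark is re-run.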
Finally, the C-NPG example requires Julia.
Once you have the julia binary installed, run the following in the terminal:
    julia c-npg.jl

The code in this repo is adapted from SafePO.