RobertKirk

Highlights

  • Pro

Pinned

  1. facebookresearch/rlfh-gen-div Public archive

    Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity".

    Python, 50 stars, 7 forks

  2. tinystories-wrappers Public

    Code for the TinyStories experiments from "Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks".

    Jupyter Notebook, 9 stars, 1 fork

  3. facebookresearch/minihack Public archive

    MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

    Python, 519 stars, 68 forks

  4. stanford_alpaca Public

    Forked from tatsu-lab/stanford_alpaca

    Code and documentation to train Stanford's Alpaca models, and generate the data.

    Python