Utility functions


load_yaml

 load_yaml (config_path)
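
The page gives only the signature of `load_yaml`, so the following is a minimal sketch of what such a helper might look like, not the library's actual implementation. It assumes the PyYAML package (`yaml.safe_load`) and assumes the function returns the parsed file as a plain dictionary; the file contents and key names in the usage part are hypothetical.

```python
import tempfile
from pathlib import Path

import yaml  # PyYAML; assumed dependency, not confirmed by the source


def load_yaml(config_path):
    """Read a YAML file and return its parsed contents (assumed behavior)."""
    with open(config_path, "r", encoding="utf-8") as f:
        # safe_load only builds plain Python objects (dict, list, str, numbers)
        return yaml.safe_load(f)


# Usage: write a small hypothetical config to a temp file and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.yaml"
    path.write_text(
        "trainer:\n  epochs: 20\noptimizer:\n  lr: 0.0001\n",
        encoding="utf-8",
    )
    config = load_yaml(path)
```

Using `safe_load` rather than `load` avoids constructing arbitrary Python objects from the file, which is the usual choice for configuration data.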

RLHFConfig

 RLHFConfig (epsilon:float=0.1, ent_coef:float=0.01, vf_coef:float=0.1)

Reference Model


create_reference_model

 create_reference_model (model)
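
In RLHF setups, a reference model is commonly a frozen copy of the policy being trained, used to measure how far the policy has drifted. The source shows only the signature, so the sketch below is one plausible reading under that assumption: it assumes `model` is a PyTorch `nn.Module`, and the `nn.Linear` stand-in is purely illustrative.

```python
import copy

import torch
from torch import nn


def create_reference_model(model: nn.Module) -> nn.Module:
    """Return a frozen deep copy of `model` (assumed behavior)."""
    ref_model = copy.deepcopy(model)
    # Freeze every parameter so the reference never receives gradient updates
    for param in ref_model.parameters():
        param.requires_grad_(False)
    # eval() disables dropout and similar training-only behavior
    return ref_model.eval()


# Usage with a small stand-in module (hypothetical; a real policy would be an LM)
policy = nn.Linear(4, 2)
reference = create_reference_model(policy)
```

Because the copy is deep, later optimizer steps on `policy` leave `reference` unchanged.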

Configs


ModelConfig

 ModelConfig (model_path:str)

TokenizerConfig

 TokenizerConfig (tokenizer_path:str)

OptimizerConfig

 OptimizerConfig (lr:float=0.0001, eps:float=1e-08,
                  weight_decay:float=1e-06)

TrainerConfig

 TrainerConfig (epochs:int=20)

PPOConfig

 PPOConfig (ent_coef:float=0.01, vf_coef:float=0.5)

InstructConfig

 InstructConfig (model:ModelConfig, tokenizer:TokenizerConfig)
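
The signatures above suggest small typed config objects that nest inside `InstructConfig`. As a rough illustration, the sketch below assumes they are plain dataclasses (the source does not confirm this) and builds an `InstructConfig` from a nested dictionary such as one parsed from YAML. The field names and defaults are taken from the signatures; the paths and the `raw` dictionary are placeholders.

```python
from dataclasses import asdict, dataclass


@dataclass
class ModelConfig:
    model_path: str


@dataclass
class TokenizerConfig:
    tokenizer_path: str


@dataclass
class OptimizerConfig:
    lr: float = 0.0001
    eps: float = 1e-08
    weight_decay: float = 1e-06


@dataclass
class InstructConfig:
    model: ModelConfig
    tokenizer: TokenizerConfig


# Hypothetical nested dictionary, e.g. the result of parsing a YAML file
raw = {
    "model": {"model_path": "path/to/model"},
    "tokenizer": {"tokenizer_path": "path/to/tokenizer"},
}

# Each sub-dictionary is unpacked into its own config object
config = InstructConfig(
    model=ModelConfig(**raw["model"]),
    tokenizer=TokenizerConfig(**raw["tokenizer"]),
)

# Fields that are not passed keep the defaults listed in the signature
optimizer = OptimizerConfig(lr=3e-5)
```

Keeping each group of settings in its own class means a misspelled key fails at construction time with a `TypeError` instead of being silently ignored.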