Trainer

RLHF Trainer

The trainer optimizes PPO's combined objective:

$$L_t^{CLIP+VF+S}(\theta) = \hat{\mathbb{E}}_t\left[L_t^{CLIP}(\theta) - c_1 L_t^{VF}(\theta) + c_2 S[\pi_\theta](s_t)\right]$$

where the clipped surrogate term is

$$L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon)\hat{A}_t\right)\right]$$

and the probability ratio

$$r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{old}}(a_t \mid s_t)}$$

is computed from log-probabilities for numerical stability:

$$\log\frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{old}}(a_t \mid s_t)} = \log \pi_\theta(a_t \mid s_t) - \log \pi_{\theta_{old}}(a_t \mid s_t)$$
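As a concrete illustration, the clipped surrogate can be sketched in plain PyTorch. The helper name `ppo_clip_loss` is hypothetical, not part of the library's API; the ratio is computed in log space, matching the identity above:

```python
import torch

def ppo_clip_loss(logprobs, old_logprobs, advantages, epsilon=0.2):
    # r_t(theta) = pi_theta / pi_theta_old, recovered from log-probabilities
    ratio = torch.exp(logprobs - old_logprobs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantages
    # take the elementwise minimum of the two surrogates;
    # negate so the objective can be minimized with a standard optimizer
    return -torch.min(unclipped, clipped).mean()
```

Note that clipping only bites when the policy has moved: a ratio of 1 (no update yet) passes through unchanged.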


source

RLHFTrainer

 RLHFTrainer (model:transformers.modeling_utils.PreTrainedModel,
              ref_model:transformers.modeling_utils.PreTrainedModel,
              config:instruct_goose.utils.RLHFConfig)

Initialize the trainer with a policy model, a reference model, and an RLHF configuration.

|           | Type            | Details                      |
|-----------|-----------------|------------------------------|
| model     | PreTrainedModel | A pre-trained language model |
| ref_model | PreTrainedModel | A reference model            |
| config    | RLHFConfig      |                              |

source

RLHFTrainer.compute_loss

 RLHFTrainer.compute_loss (query_ids:TensorType["batch_size","seq_len"],
                           query_attention_mask:TensorType["batch_size","seq_len"],
                           response_ids:TensorType["batch_size","seq_len"],
                           response_attention_mask:TensorType["batch_size","seq_len"],
                           rewards:TensorType["batch_size"])

Calculate PPO’s loss.
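A loss of this shape combines the clipped surrogate with a value-function term and an entropy bonus, as in the combined objective above. A minimal pure-PyTorch sketch, assuming flat per-sample tensors; the helper name, argument layout, and coefficient defaults are illustrative, not the library's actual `compute_loss` signature:

```python
import torch
import torch.nn.functional as F

def ppo_combined_loss(logprobs, old_logprobs, advantages,
                      values, returns, entropy,
                      epsilon=0.2, c1=0.5, c2=0.01):
    ratio = torch.exp(logprobs - old_logprobs)
    # clipped surrogate L^CLIP
    surrogate = torch.min(
        ratio * advantages,
        torch.clamp(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantages,
    )
    # value-function term L^VF: squared error against the returns
    value_loss = F.mse_loss(values, returns)
    # maximize surrogate and entropy, minimize value error -> negate
    return -(surrogate.mean() - c1 * value_loss + c2 * entropy.mean())
```

The entropy bonus (weighted by `c2`) discourages the policy from collapsing to deterministic outputs early in training, while `c1` balances the value-function fit against the policy objective.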