Cutting-edge NLP techniques

Direct Preference Optimisation

Decision Transformers

Fine-tuning via reinforcement learning with human feedback

Active