Chinmaya-Kausik / RLHF-comparison
Comparing various RLHF methods
Topics: reinforcement-learning, transformers, transformer, ppo, dpo, llm, llms, rlhf, reinforcement-learning-from-human-feedback, reinforcement-learning-from-ai-feedback
Stars: 0
Updated Sep 23, 2024
Language: Jupyter Notebook

satyampurwar / large-language-models
Unlocking the Power of Generative AI: In-Context Learning, Instruction Fine-Tuning and Reinforcement Learning Fine-Tuning.
Topics: memory-management, bert, conda-environment, kl-divergence, encoder-decoder-model, proximal-policy-optimization, encoder-model, storage-management, megacmd, model-quantization, large-language-models, prompt-engineering, generative-ai, reinforcement-learning-from-human-feedback, flan-t5, few-shot-prompting, low-rank-adaptation, reinforcement-learning-from-ai-feedback, peft-fine-tuning-llm, instruction-fine-tuning
Stars: 0
Updated Oct 25, 2024
Language: Jupyter Notebook