Awesome Reasoning LLM Tutorial/Survey/Guide
Updated Mar 29, 2025 - Python
Explore the Multimodal “Aha Moment” on 2B Model
A brief and partial summary of RLHF algorithms.
Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".
[EMNLP 2022] Continual Training of Language Models for Few-Shot Learning
A novel alignment framework that leverages image retrieval to mitigate hallucinations in Vision Language Models.
Pure RL to post-train base models for social reasoning capabilities. Lightweight replication of DeepSeek-R1-Zero with Social IQa dataset.
RFTT: Reasoning with Reinforced Functional Token Tuning
Official implementation for "Diffusion Instruction Tuning"
Official repository for the paper "Monocular Event-Based Vision for Obstacle Avoidance with a Quadrotor" by Bhattacharya et al. (2024) from GRASP, Penn & RPG, UZH.
Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"
An Approach to Enhancing the Efficacy of Post-Training Using Synthetic Data by Iterative Data Selection
Machine Reading Comprehension Competition with a Korean BERT Model
Reproducible figures for "Post Training in Deep Learning"
We use RL to train a SOTA MLLM captioner.
Sudoku4LLM is a Sudoku dataset generator for training and evaluating reasoning in Large Language Models (LLMs). It offers customizable puzzles, difficulty levels, and 11 serialization formats to support structured data reasoning and Chain of Thought (CoT) experiments.
Post Training Android Part 4 for Software Laboratory Center 19-2 Binus University
Code repository for "Post-pre-training for Modality Alignment in Vision-Language Foundation Models" (CVPR 2025)
Post Training Android Part 2 for Software Laboratory Center 19-2 Binus University