Pinned
- open-compass/opencompass
  OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
- PremiLab-Math/MathCheck
  [ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
- NLP2CT/kNN-TL
  [ACL 2023] kNN-TL: k-Nearest-Neighbor Transfer Learning for Low-Resource Neural Machine Translation
- NLP2CT/UaIT
  [EMNLP 2024] Can LLMs Learn Uncertainty on Their Own? Expressing Uncertainty Effectively in A Self-Training Manner (Python)
- DevoAllen/Awesome-Reasoning-Economy-Papers
  Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models