- 🌱 I’m a PhD candidate at Nanyang Technological University and the Institute for Infocomm Research (I2R), A*STAR.
- 🔭 I’m currently working on machine reasoning with LLMs.
Here are some projects that I lead or contribute to:
- PFPO: The source code for our ICLR 2025 paper: Preference Optimization for Reasoning with Pseudo Feedback.
- ProcessDPO: The source code for our EMNLP paper (Outstanding Paper Award): Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing.
- LLMSanitize: A tool we developed for detecting data contamination in Large Language Models.
- llama-pipeline-parallel: A prototype and tutorial for DeepSpeed pipeline-parallel training of Llama.
- UNK-VQA: A dataset we collected for benchmarking the abstention ability of large multimodal models.
- PandaLLM: One of the earliest projects exploring continual pre-training of Llama on Chinese corpora.
- LogicLLM: A self-supervised pre-training method for enriching the logic-relevant knowledge of large language models.
Here are some projects on machine reading comprehension and information retrieval from the era of encoder-only pre-trained LMs: