hctian713
🫥 Indulge in coding

Wan: Open and Advanced Large-Scale Video Generative Models

Python 4,393 361 Updated Feb 28, 2025

Official Repository of paper OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Python 68 Updated Feb 26, 2025

Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓

2,671 152 Updated Feb 21, 2025

The Next Step Forward in Multimodal LLM Alignment

Python 104 2 Updated Feb 27, 2025

[ICLR 2025 Oral] The official implementation of "Diffusion-Based Planning for Autonomous Driving with Flexible Guidance"

Python 261 30 Updated Feb 11, 2025

[ICLR 2025] The official implementation of SSR

83 6 Updated Feb 4, 2025

PLUTO: Push the Limit of Imitation Learning-based Planning for Autonomous Driving

Python 323 50 Updated Jul 15, 2024

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 7,594 486 Updated Feb 28, 2025

World's First Large-scale High-quality Robotic Manipulation Benchmark

Python 1,414 95 Updated Feb 26, 2025

[RSS 2024] Learning Manipulation by Predicting Interaction

Python 100 1 Updated Aug 18, 2024

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,119 160 Updated Feb 13, 2025

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Python 11,908 1,057 Updated Feb 27, 2025

[CVPR 2025] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving

Python 460 23 Updated Feb 27, 2025
Python 146 7 Updated Jul 12, 2024

[NeurIPS 2024] Behavioral Topology (BeTop), a multi-agent behavior formulation for interactive motion prediction and planning

Python 106 6 Updated Nov 12, 2024

Align Anything: Training All-modality Model with Feedback

Python 2,412 336 Updated Feb 19, 2025

[NeurIPS 2024] NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

Python 379 27 Updated Jan 17, 2025

[NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation

Python 98 5 Updated Dec 5, 2024

The official repository of our survey paper: "Towards a Unified View of Preference Learning for Large Language Models: A Survey"

161 2 Updated Oct 28, 2024

Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models

271 11 Updated Feb 18, 2025

✨✨ [ICLR 2025] MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

Python 93 7 Updated Feb 14, 2025
JavaScript 1 1 Updated Feb 13, 2025

The devkit of the nuPlan dataset.

Python 771 158 Updated Dec 12, 2024

Enhancing End-to-End Autonomous Driving with Latent World Model

106 4 Updated Jun 13, 2024

✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models

Python 152 7 Updated Dec 26, 2024

Can 3D Vision-Language Models Truly Understand Natural Language?

21 Updated Mar 28, 2024

[ECCV 2024] Embodied Understanding of Driving Scenarios

Python 178 14 Updated Jan 2, 2025

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Python 5,815 378 Updated Mar 14, 2024

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,851 4,063 Updated Jul 17, 2024