    Repositories list

    • TokLIP

      Public
      TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
      Other
      01310 Updated May 9, 2025
    • GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
      Python
      Other
      827420 Updated Apr 28, 2025
    • ColorFlow

      Public
      The official implementation of paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization"
      Python
      Other
      33408100 Updated Apr 16, 2025
    • AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
      Python
      Other
      2531251 Updated Apr 9, 2025
    • [SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
      Python
      Other
      18358160 Updated Apr 8, 2025
    • Python
      Apache License 2.0
      17900 Updated Apr 5, 2025
    • DiTCtrl

      Public
      [CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation"
      Python
      Other
      626330 Updated Mar 30, 2025
    • Moto

      Public
      Latent Motion Token as the Bridging Language for Robot Manipulation
      Python
      Other
      18520 Updated Mar 24, 2025
    • DI-PCG

      Public
      Code release of our paper "DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation".
      Python
      Other
      210020 Updated Mar 23, 2025
    • BlobCtrl

      Public
      [Arxiv'25] BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing
      Python
      Other
      28810 Updated Mar 20, 2025
    • Divot

      Public
      Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)
      Python
      Other
      16710 Updated Feb 27, 2025
    • SEED-Voken: A Series of Powerful Visual Tokenizers
      Python
      Apache License 2.0
      3187731 Updated Feb 19, 2025
    • Official Code for MotionCtrl [SIGGRAPH 2024]
      Python
      Apache License 2.0
      761.4k280 Updated Feb 19, 2025
    • ViT-Lens

      Public
      [CVPR 2024] ViT-Lens: Towards Omni-modal Representations
      Python
      Other
      917540 Updated Feb 3, 2025
    • A framework to convert any 2D videos to immersive stereoscopic 3D
      Python
      Other
      27325191 Updated Jan 7, 2025
    • InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
      Python
      Apache License 2.0
      4183.8k1094 Updated Jan 3, 2025
    • BrushEdit

      Public
      [TPAMI under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"
      Python
      Other
      27557110 Updated Dec 26, 2024
    • FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
      JavaScript
      Other
      1017040 Updated Dec 23, 2024
    • BrushNet

      Public
      [ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
      Python
      Other
      1311.6k520 Updated Dec 17, 2024
    • [CVPR 2025] Boosting Generative Novel View Synthesis with Sparse and Unposed Images
      Python
      Other
      611030 Updated Dec 9, 2024
    • FluxKits

      Public
      Python
      Apache License 2.0
      58641 Updated Nov 27, 2024
    • PhotoMaker [CVPR 2024]
      Jupyter Notebook
      Other
      7919.9k1466 Updated Oct 31, 2024
    • SEED-Story: Multimodal Long Story Generation with Large Language Model
      Python
      Other
      6584050 Updated Oct 11, 2024
    • ST-LLM

      Public
      [ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
      Python
      Apache License 2.0
      5145100 Updated Sep 10, 2024
    • mllm-npu

      Public
      mllm-npu: training multimodal large language models on Ascend NPUs
      Python
      Apache License 2.0
      29030 Updated Aug 29, 2024
    • MasaCtrl

      Public
      [ICCV 2023] Consistent Image Synthesis and Editing
      Python
      Apache License 2.0
      31792232 Updated Aug 19, 2024
    • Plot2Code

      Public
      Python
      31900 Updated Aug 17, 2024
    • GFPGAN

      Public
      GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
      Python
      Other
      6.1k37k36126 Updated Jul 26, 2024
    • CustomNet

      Public
      Python
      Apache License 2.0
      1127271 Updated Jul 22, 2024
    • T2I-Adapter
      Python
      2203.7k886 Updated Jun 21, 2024