CaraJ7

Dongzhi Jiang CaraJ7

85 followers · 51 following

MMLab, CUHK
Hong Kong, China
https://caraj7.github.io/

Stars

PicoTrex / Mind-Brush

Implement search image generation similar to Nano-banana-pro / Seedream / FLUX.

Python 76 1 Updated Feb 3, 2026

Ivan-Tang-3D / 3DGen-R1

[CVPR 2026] The official implementation of the paper "Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation"

Python 105 1 Updated Feb 28, 2026

yejy53 / RealGen

RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards.

Python 318 26 Updated Dec 29, 2025

CaraJ7 / DraCo

Official repository for the paper "DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation"

17 Updated Dec 7, 2025

ZiyuGuo99 / Thinking-while-Generating

The first interleaved framework for textual reasoning within the visual generation process

158 1 Updated Nov 21, 2025

ZiyuGuo99 / MME-CoF

Are Video Models Ready as Zero-shot Reasoners?

Python 84 4 Updated Nov 24, 2025

ULMEvalKit / ULMEvalKit

ULMEvalKit: One-Stop Eval ToolKit for Image Generation

Python 56 2 Updated Dec 17, 2025

gilbarbara / logos

A huge collection of SVG logos

SVG 6,681 753 Updated Jul 21, 2025

FlyMyAI / flymyai-lora-trainer

Qwen-Image text-to-image LoRA trainer

Python 716 63 Updated Dec 16, 2025

PicoTrex / Awesome-Nano-Banana-images

A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…

21,125 2,174 Updated Dec 12, 2025

yejy53 / Echo-4o

Echo-4o: Harnessing Proprietary Models’ Synthetic Images for Improved Image Generation

Jupyter Notebook 503 28 Updated Dec 9, 2025

LongHZ140516 / PaperGallery

A curated gallery and toolkit designed to provide inspiration for scientific illustrations, project sites, and visual storytelling in research.

977 28 Updated Feb 10, 2026

christophschuhmann / improved-aesthetic-predictor

CLIP+MLP Aesthetic Score Predictor

Python 1,265 113 Updated Jul 1, 2024

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,710 504 Updated Oct 27, 2025

ziqipang / RandAR

[CVPR 2025 (Oral)] Open implementation of "RandAR"

Python 207 6 Updated Jul 14, 2025

guandeh17 / Self-Forcing

Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)

Python 3,175 245 Updated Sep 12, 2025

xinyan-cxy / MINT-CoT

[NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning

Python 101 5 Updated Sep 19, 2025

zhaochen0110 / Awesome_Think_With_Images

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,346 41 Updated Feb 3, 2026

VIPL-GENUN / Jodi

Jodi: Unification of Visual Generation and Understanding via Joint Modeling

Python 90 2 Updated Jun 19, 2025

shilinyan99 / CrossLMM

CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms

25 Updated Dec 21, 2025

selftok-team / SelftokTokenizer

Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning

Python 237 8 Updated May 30, 2025

StarsfieldAI / R1-V

Witness the aha moment of VLM with less than $3.

Python 4,036 287 Updated May 19, 2025

LLaVA-VL / LLaVA-NeXT

Python 4,578 448 Updated Sep 14, 2025

AIDC-AI / Awesome-Unified-Multimodal-Models

Awesome Unified Multimodal Models

1,131 36 Updated Feb 6, 2026

PicoTrex / GPT-ImgEval

GPT-ImgEval: Evaluating GPT-4o’s state-of-the-art image generation capabilities

Python 305 8 Updated May 3, 2025

opendatalab / LOKI

[ICLR 2025 Spotlight] The official implementation of the paper "LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models"

Python 175 4 Updated Feb 7, 2026

Diffusion-CoT / ReflectionFlow

[ICCV 2025] Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning

Python 217 12 Updated Nov 5, 2025

CaraJ7 / T2I-R1

[NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Python 430 24 Updated Sep 18, 2025

xie-lab-ml / awesome-alignment-of-diffusion-models

[ACM Computing Surveys] The collection of awesome papers on alignment of diffusion models.

406 17 Updated Feb 6, 2026

EvolvingLMMs-Lab / multimodal-search-r1

MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.

Python 408 21 Updated Aug 26, 2025
