ThinkGen: Generalized Thinking for Visual Generation

Siyu Jiao¹, Yiheng Lin¹, Yujie Zhong², Qi She², Wei Zhou², Xiaohan Lan²,
Zilong Huang², Fei Yu², Yingchen Yu², Yunqing Zhao², Yao Zhao¹, Yunchao Wei¹

¹ Beijing Jiaotong University, ² Bytedance

arXiv  huggingface weights 

🚀 Quick Start

🛠️ Environment Setup

✅ Recommended Setup

# 1. Clone the repo
git clone https://github.com/jiaosiyuu/ThinkGen.git
cd ThinkGen

# 2. (Optional) Create a clean Python environment
conda create -n thinkgen python=3.11
conda activate thinkgen

# 3. Install dependencies
# 3.1 Install PyTorch (choose correct CUDA version)
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124

# 3.2 Install other required packages
pip install -r req.txt

# ThinkGen runs without flash-attn, but we recommend installing it for best performance.
pip install --no-cache-dir flash-attn==2.7.4.post1 --no-build-isolation

🌏 For users in Mainland China

pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://mirror.sjtu.edu.cn/pytorch-wheels/cu124
pip install -r req.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install --no-cache-dir flash-attn==2.7.4.post1 --no-build-isolation -i https://pypi.tuna.tsinghua.edu.cn/simple
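
Whichever index you installed from, it can help to confirm that the pinned versions landed in the active environment before running the model. The snippet below is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of ThinkGen. It reports `None` for any distribution that cannot be found under the given name, which is expected for `flash-attn` if you skipped the optional step.

```python
from importlib import metadata

# Distribution names as they appear in the pip commands above.
PACKAGES = ["torch", "torchvision", "torchaudio", "flash-attn"]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if not found."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment (or not under this name).
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Compare the printed versions against the pins above (torch 2.6.0, torchvision 0.21.0, torchaudio 2.6.0, flash-attn 2.7.4.post1).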

  • Run Locally:
from ThinkGen.model import ThinkGen_Chat
import os

chat_model = ThinkGen_Chat(
    model_path="JSYuuu/ThinkGen",
    dtype='bf16',
    height=1024,
    width=1024
)


## Gen: text-to-image generation
messages = [
    {"type": "text", "value": '''A young woman wearing a straw hat, standing in a golden wheat field.'''}
]
results = chat_model.generate_image(messages)
output_dir = "vis/chat"
os.makedirs(output_dir, exist_ok=True)

for i, img in enumerate(results.images):
    save_path = os.path.join(output_dir, f"result_{i}.png")
    img.save(save_path)
    print(f"Saved to {save_path}")


## Gen-Think: generation with chain-of-thought (CoT) and prompt rewriting
messages = [
    {"type": "text", "value": '''A young woman wearing a straw hat, standing in a golden wheat field.'''}
]
results = chat_model.generate_image(messages, think=True)
output_dir = "vis/chat"
os.makedirs(output_dir, exist_ok=True)

print(f"cot & rewrite prompt: \n{results.prompt_cot}")
for i, img in enumerate(results.images):
    save_path = os.path.join(output_dir, f"result_think_{i}.png")
    img.save(save_path)
    print(f"Saved to {save_path}")


## Und: image understanding
messages = [
    {"type": "image", "value": "images/teaser.png"},
    {"type": "text", "value": "Describe this image"}
]

response = chat_model.generate_text(messages)
print(response)
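
The Gen and Gen-Think examples above repeat the same save loop. As a convenience, that loop could be factored into a small helper. This is a sketch only: `save_images` is a hypothetical name, not part of the ThinkGen package, and it assumes each item exposes a `save(path)` method, as the objects in `results.images` do in the examples above.

```python
import os


def save_images(images, output_dir, prefix="result"):
    """Save each image as <output_dir>/<prefix>_<i>.png and return the paths.

    `images` is any iterable of objects with a `save(path)` method.
    """
    os.makedirs(output_dir, exist_ok=True)
    saved_paths = []
    for i, img in enumerate(images):
        save_path = os.path.join(output_dir, f"{prefix}_{i}.png")
        img.save(save_path)
        print(f"Saved to {save_path}")
        saved_paths.append(save_path)
    return saved_paths
```

With this helper, the Gen example would reduce to `save_images(results.images, "vis/chat")` and the Gen-Think example to `save_images(results.images, "vis/chat", prefix="result_think")`.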

Acknowledgments

This work builds upon the following great open-source projects:
