CV-xueba/PICD_ImageComposition

PICD: Photographic Image Composition Dataset

Can Machines Understand Composition? A Dataset and Benchmark for Photographic Image Composition Embedding and Understanding
📌 CVPR 2025 Highlight

📄 CVPR 2025 Paper
📑 Supplementary Appendix


📌 Overview

PICD is a large-scale dataset for photographic image composition analysis, currently containing 49,123 high-quality images annotated with 24 composition categories.

This dataset is intended to support the evaluation and advancement of composition learning in AI models. It is applicable to a wide range of tasks, including aesthetic quality assessment, composition-aware image cropping, and more. We encourage researchers and practitioners to explore creative uses of PICD.

The composition label system is structured along two axes:

  • Element Types: Points, Lines, and Shapes (inspired by Kandinsky’s principles)
  • Arrangement Patterns: Rule of Thirds, Centered, Diagonal, Vertical, Horizontal, Triangle, C-curve, O-curve, S-curve, Radial, Dense, Scatter, etc.

📖 For detailed category definitions and label design, please refer to the Appendix.

Label System Figure

Figure 1. The PICD label system is structured along two axes: element types and arrangement patterns. Column 1 (green) shows arrangement types; Columns 2–4 show compositional element types. Categories are numbered 1–24, with abbreviations in blue. Red boxes indicate merged categories; blue strikethroughs mark categories excluded because of low frequency. Column 5 highlights dominant compositional factors.

Category Sample Figure

Figure 2. Sample images from the 24 composition categories in PICD. Category abbreviations appear in blue parentheses.


📊 Dataset Information

PICD is actively maintained and will continue to be expanded. The current release includes:

  • ✅ 49,123 images
  • ✅ Verified composition category annotations
  • ⏳ Negative samples (images not conforming to any predefined category): coming soon
  • ⏳ Composition quality scores: coming soon
  • ⏳ Textual composition descriptions: coming soon

🔗 Download

PICD consists of both image files and annotations.

1. Images

1) Direct Access:

Image download is divided into two parts based on licensing:

Part 1: Images with redistribution permission

Part 2: Images requiring user-side access

  • This part covers 4,546 images from public datasets that do not permit redistribution (e.g., AVA).
  • For these, we provide a mapping file that links each PICD-assigned image ID to the original dataset image ID or URL. You may download the original images from their respective sources using this mapping:
    👉 Image ID Mapping File
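
As a rough illustration of how the mapping file might be used, the sketch below separates entries that point to a URL from entries that point to an ID in the original dataset. The layout of the mapping file is not documented here, so the CSV format and the column names `picd_id` and `source` are assumptions made for this example, and the sample rows are invented.

```python
import csv
import io
from urllib.parse import urlparse


def is_url(value):
    """Return True if the value looks like an http(s) URL."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def split_mapping(handle, id_column="picd_id", source_column="source"):
    """Split mapping rows into URL-based and dataset-ID-based entries.

    The column names are hypothetical; adjust them to the real mapping file.
    """
    by_url, by_source_id = {}, {}
    for row in csv.DictReader(handle):
        target = by_url if is_url(row[source_column]) else by_source_id
        target[row[id_column]] = row[source_column]
    return by_url, by_source_id


# Invented sample rows for illustration only; they are not real PICD entries.
sample = io.StringIO(
    "picd_id,source\n"
    "demo_0001,https://example.com/photo.jpg\n"
    "demo_0002,123456\n"
)
by_url, by_source_id = split_mapping(sample)
print(by_url)
print(by_source_id)
```

URL entries could then be fetched directly, while ID entries would be looked up in a local copy of the source dataset obtained under that dataset's own terms.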

2) Alternative Access:
If you prefer to request both parts directly via email (especially for convenience or if you encounter access issues), please send a message to picd2025@outlook.com with your affiliation and intended use. We will respond with the download links after reviewing your request.

Accessing or using the dataset in any way implies agreement to the
📄 PICD Dataset Terms of Use (PDF)

2. Annotations

  • ✏️ Image Annotation File
    👉 Download Annotations
    This CSV file contains the following fields:
    • img_id: The PICD image ID
    • category_id: Index of the composition category (1–24)
    • category_abbre: Abbreviated category label (as shown in Figure 2)
    • category_full_name: Full name of the composition category

The mapping among category_id, category_abbre, and category_full_name follows the structure shown in Figure 1.
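
A minimal sketch of reading the annotation file with Python's standard `csv` module is shown below. It assumes the CSV is comma-delimited and has a header row using exactly the four field names listed above. The sample rows, abbreviations, and category names in the example are made up for illustration and do not come from PICD.

```python
import csv
import io
from collections import Counter

EXPECTED_FIELDS = ["img_id", "category_id", "category_abbre", "category_full_name"]


def read_annotations(handle):
    """Parse annotation rows from an open text file and validate category IDs."""
    reader = csv.DictReader(handle)
    missing = [f for f in EXPECTED_FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        category_id = int(row["category_id"])
        # The README states that category indices run from 1 to 24.
        if not 1 <= category_id <= 24:
            raise ValueError(f"category_id out of range: {category_id}")
        row["category_id"] = category_id
        rows.append(row)
    return rows


def count_per_category(rows):
    """Count how many images fall into each category index."""
    return Counter(row["category_id"] for row in rows)


# Invented sample rows for illustration only; real values come from the CSV.
sample = io.StringIO(
    "img_id,category_id,category_abbre,category_full_name\n"
    "demo_0001,1,AA,Hypothetical category A\n"
    "demo_0002,1,AA,Hypothetical category A\n"
    "demo_0003,2,BB,Hypothetical category B\n"
)
rows = read_annotations(sample)
counts = count_per_category(rows)
print(counts)
```

For the real file, pass a handle opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory sample; the path depends on where the downloaded annotations are saved.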


📄 License and Terms

PICD is released under the
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

All users must also agree to the dataset-specific terms of use:
📄 PICD Dataset Terms of Use (PDF)


🔧 Citation

If you use PICD in your research, please cite:

@inproceedings{zhao2025can,
  title={Can Machines Understand Composition? Dataset and Benchmark for Photographic Image Composition Embedding and Understanding},
  author={Zhao, Zhaoran and Lu, Peng and Zhang, Anran and Li, Peipei and Li, Xia and Liu, Xuannan and Hu, Yang and Chen, Shiyi and Wang, Liwei and Guo, Wenhao},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={14411--14421},
  year={2025}
}
