The workshop aims to explore the use of synthetic data in training and evaluating computer vision models, as well as in other related domains. During the last decade, advancements in computer vision were catalyzed by the release of painstakingly curated human-labeled datasets. Recently, researchers have increasingly turned to synthetic data as an alternative to labor-intensive human-labeled datasets for its scalability, customizability, and cost-effectiveness. Synthetic data offers the potential to generate large volumes of diverse and high-quality vision data, tailored to specific scenarios and edge cases that are hard to capture in real-world data. However, challenges remain, such as the domain gap between synthetic and real-world data, potential biases in synthetic generation, and the generalizability of models trained on synthetic data. We hope the workshop can provide a forum to discuss and encourage further exploration in these areas.

Invited Speakers

Ani Kembhavi
Allen Institute for AI (AI2)
Jia Deng
Princeton University
Ludwig Schmidt
University of Washington
Ming Lin
University of Maryland
Ruslan Salakhutdinov
Carnegie Mellon University
Yale Song
Yannis Kalantidis


Time (PST)     Event
09:00 - 09:10  Opening
09:10 - 09:50  Talk by Ludwig Schmidt
09:50 - 10:30  Talk by Ruslan Salakhutdinov
10:30 - 10:50  Break
10:50 - 11:30  Talk by Yale Song
11:30 - 12:10  Talk by Jia Deng
12:10 - 13:30  Lunch
13:30 - 14:30  Poster Session
14:30 - 15:10  Talk by Ani Kembhavi
15:10 - 15:50  Talk by Ming Lin
15:50 - 16:10  Break
16:10 - 16:50  Talk by Yannis Kalantidis
16:50 - 17:05  Oral Presentation: CinePile: A Long Video Question Answering Dataset and Benchmark
17:05 - 17:20  Oral Presentation: GenAI-Bench: A Holistic Benchmark for Compositional Text-to-Visual Generation
17:20 - 17:30  Closing

Poster Session

Notice: The location of the poster session is different from the workshop.

Accepted Papers

DDOS: The Drone Depth and Obstacle Segmentation Dataset. Benedikt Kolbeinsson, Krystian Mikolajczyk
From NeRF to 3DGS: A Leap in Stereo Dataset Quality? Magnus Kaufmann Gjerde, Filip Slezák, Joakim Bruslund Haurum, Thomas B. Moeslund
Training Robust Classifiers with Diffusion Denoised Examples. Chandramouli Shama Sastry, Sri Harsha Dumpala, Sageev Oore
Uncertainty Inclusive Contrastive Learning for Leveraging Synthetic Images. Fiona Cai, Emily Mu, John Guttag
HDL-SAM: A Hybrid Deep Learning Framework for High-Resolution Imaging in Scanning Acoustic Microscopy. Akshit Sharma, Ayush Somani, Pragyan Banerjee, Frank Melandsø, Anowarul Habib
MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation. Linyan Yang, Lukas Hoyer, Mark Weber, Tobias Fischer, Dengxin Dai, Laura Leal-Taixé, Daniel Cremers, Marc Pollefeys, Luc Van Gool
An Approach to Synthesize Thermal Infrared Ship Images. Doan Thinh Vo, Phan Anh Đức, Nguyen Nhu Thao, Huong Ninh
LAESI: Leaf Area Estimation with Synthetic Imagery. Jacek Kałużny, Yannik Schreckenberg, Karol Cyganik, Peter Annighöfer, Soren Pirk, Dominik Michels, Mikolaj Cieslak, Farhah Assaad, Bedrich Benes, Wojtek Palubicki
GenAI-Bench: A Holistic Benchmark for Compositional Text-to-Visual Generation. Baiqi Li, Zhiqiu Lin, Deepak Pathak, Jiayao Emily Li, Xide Xia, Graham Neubig, Pengchuan Zhang, Deva Ramanan
SEVD: Synthetic Event-based Vision Dataset for Ego and Fixed Traffic Perception. Manideep Reddy Aliminati, Bharatesh Chakravarthi, Aayush Atul Verma, Arpitsinh Vaghela, Hua Wei, Xuesong Zhou, Yezhou Yang
Training with Real instead of Synthetic Generated Images Still Performs Better. Scott Geng, Ranjay Krishna, Pang Wei Koh
A Neural Model for High-Performance Scanning Electron Microscopy Image Simulation of Porous Materials. Tim Dahmen, Markus Kronenberger, Niklas Rottmayer, Katja Schladitz, Claudia Redenbach
S2MGen: A Synthetic Skin Mask Generator for Improving Segmentation. Subhadra Gopalakrishnan, Trisha Mittal, Jaclyn Pytlarz, Yuheng Zhao
Self-Distillation on Conditional Spatial Activation Maps for ForeGround-BackGround Segmentation. Yeruru Asrar Ahmed, Anurag Mittal
GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning. Mehran Kazemi, Hamidreza Alvari, Ankit Anand, Jialin Wu, Xi Chen, Radu Soricut
CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion. Geonmo Gu, Sanghyuk Chun, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun
Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video. Hongchi Xia, Zhi-Hao Lin, Wei-Chiu Ma, Shenlong Wang
UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video. Zhi-Hao Lin, Bohan Liu, Yi-Ting Chen, David Forsyth, Jia-Bin Huang, Anand Bhattad, Shenlong Wang
DISC: Latent Diffusion Models with Self-Distillation from Separated Conditions for Prostate Cancer Grading. Man M. Ho, Elham Ghelichkhan, Yosep Chong, Yufei Zhou, Beatrice S. Knudsen, Tolga Tasdizen
On the Equivalency, Substitutability, and Flexibility of Synthetic Data. Che-Jui Chang, Danrui Li, Seonghyeon Moon, Mubbasir Kapadia
Beyond Internet Images: Evaluating Vision-Language Models for Domain Generalization on Synthetic-to-Real Industrial Datasets. Louis Hémadou, Héléna Vorobieva, Ewa Kijak, Frederic Jurie
DiffInject: Revisiting Debias via Synthetic Data Generation using Diffusion-based Style Injection. Donggeun Ko, Sangwoo Jo, Dongjun Lee, Namjun Park, Jaekwang Kim
Balancing Quality and Quantity: The Impact of Synthetic Data on Smoke Detection Accuracy in Computer Vision. Ethan Seefried, Changsoo Jung, Jack Fitzgerald, Mariah Bradford, Trevor Chartier, Nathaniel Blanchard
Object-Conditioned Energy-Based Model for Attention Map Alignment in Text-to-Image Diffusion Models. Yasi Zhang, Peiyu Yu, Ying Nian Wu
DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control. Yuru Jia, Lukas Hoyer, Shengyu Huang, Tianfu Wang, Luc Van Gool, Konrad Schindler, Anton Obukhov
CinePile: A Long Video Question Answering Dataset and Benchmark. Ruchit Rawal, Khalid Saifullah, Ronen Basri, David Jacobs, Gowthami Somepalli, Tom Goldstein
m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks. Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna
Harlequin: Color-driven Generation of Synthetic Data for Referring Expression Comprehension. Luca Parolari, Elena Izzo, Lamberto Ballan
Inclusive Portrait Lighting Estimation Model Leveraging Graphic-Based Synthetic Data. Kin Ching Lydia Chau, Tao Li, Ruowei Jiang, Zhi Yu, Panagiotis-Alexandros Bokaris
Attributed Synthetic Data Generation for Zero-shot Image Classification. Shijian Wang, Linxin Song, Ryotaro Shimizu, Masayuki Goto, Hanqian Wu
A Benchmark Synthetic Dataset for C-SLAM in Service Environments. Harin Park, Inha Lee, Minje Kim, Hyungyu Park, Kyungdon Joo
Compositional Learning of Visually-Grounded Concepts Using Reinforcement. Zijun Lin, Haidi Azaman, M Ganesh Kumar, Cheston Tan
Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models. Yushi Hu, Otilia Stretcu, Chun-Ta Lu, Krishnamurthy Viswanathan, Kenji Hata, Enming Luo, Ranjay Krishna, Ariel Fuxman
DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback. Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian
SIFTer: Self-improving Synthetic Datasets for Pre-training Classification Models. Ryo Hayamizu, Shota Nakamura, Sora Takashima, Hirokatsu Kataoka, Ikuro Sato, Nakamasa Inoue, Rio Yokota
R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding. Qirui Wu, Sonia Raychaudhuri, Daniel Ritchie, Manolis Savva, Angel X Chang
Intrinsic LoRA: A Generalist Approach for Discovering Knowledge in Generative Models. Xiaodan Du, Nicholas Kolkin, Greg Shakhnarovich, Anand Bhattad
XIMAGENET-12: An Explainable Visual Benchmark Dataset for Model Robustness Evaluation. Qiang Li, Dan Zhang, Shengzhao Lei, Xun Zhao, WeiWei Li, Porawit Kamnoedboon, Junhao Dong, Shuyan Li
Paved2Paradise: Cost-Effective and Scalable LiDAR Simulation by Factoring the Real World. Michael A. Alcorn, Noah Schwartz
Virtually Enriched NYU Depth V2 Dataset for Monocular Depth Estimation: Do We Need Artificial Augmentation? Dmitry Yu. Ignatov, Andrey Ignatov, Radu Timofte
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training? Hasan Abed Al Kader Hammoud, Hani Itani, Fabio Pizzati, Adel Bibi, Bernard Ghanem
Implicit Neural Clustering. Thomas Kreutz, Max Mühlhäuser, Alejandro Sanchez Guinea

Call for Papers

We invite papers on the use of synthetic data for training and evaluating computer vision models. We welcome submissions along two tracks:

Accepted papers will be allocated a poster presentation and displayed on the workshop website. In addition, we will offer a Best Long Paper award, Best Paper Runner-up award, and Best Short Paper with oral presentation.


Potential topics include, but are not limited to:

Submission Instructions

Submissions should be anonymized and formatted using the CVPR 2024 template and uploaded as a single PDF. Note that our workshop is non-archival.

Submission link: OpenReview Link

Important workshop dates

Related Workshops


Jieyu Zhang
University of Washington
Cheng-Yu Hsieh
University of Washington
Zixian Ma
University of Washington
Shobhita Sundaram
Massachusetts Institute of Technology
Weikai Huang
University of Washington
Wei-Chiu Ma
Cornell University
Phillip Isola
Massachusetts Institute of Technology
Ranjay Krishna
University of Washington