Overview

The workshop aims to explore the use of synthetic data in training and evaluating computer vision models, as well as in other related domains. During the last decade, advancements in computer vision were catalyzed by the release of painstakingly curated human-labeled datasets. Recently, researchers have increasingly turned to synthetic data as an alternative to labor-intensive human-labeled datasets for its scalability, customizability, and cost-effectiveness. Synthetic data offers the potential to generate large volumes of diverse and high-quality vision data, tailored to specific scenarios and edge cases that are hard to capture in real-world data. However, challenges remain, such as the domain gap between synthetic and real-world data, potential biases in synthetic generation, and the generalizability of models trained on synthetic data. We hope the workshop can provide a forum to discuss and encourage further exploration in these areas.

Invited Speakers

The speakers haven’t been finalized; stay tuned for updates!

Angela Dai
Technical University of Munich

Bharath Hariharan
Cornell University

Bolei Zhou
University of California, Los Angeles

Jia Deng
Princeton University

Kiana Ehsani
Vercept

Yael Vinker
Massachusetts Institute of Technology

Schedule

Workshop date: June 11th, 2025 (Full day)
Location: Grand C2 (talks), Exhall D (posters)

09:00 - 09:20	Opening (Jieyu Zhang)
09:20 - 10:00	Talk by Yael Vinker
10:00 - 10:40	Talk by Kiana Ehsani
10:40 - 11:00	Break
11:00 - 11:40	Talk by Jia Deng
11:40 - 13:00	Lunch
13:00 - 14:00	Poster Session (in Exhall D)
14:30 - 15:10	Talk by Bolei Zhou
15:10 - 15:50	Talk by Bharath Hariharan
15:50 - 16:10	Break
16:10 - 16:50	Talk by Angela Dai
16:50 - 17:20	Closing (Zixian Ma)

Poster Session

Poster Numbers: #208 - #268 (the poster number for each accepted paper is listed below)
Time: 13:00 - 14:00 (1:00 PM - 2:00 PM)
Location: Exhall D
Poster Requirements: Please follow the CVPR 2025 main conference poster requirements

Awards

Best Long Paper

DemoGen: Synthetic Demonstration Generation for Data-Efficient Visuomotor Policy Learning
Zhengrong Xue, Shuying Deng, Zhenyang Chen, Yixuan Wang, Zhecheng Yuan, Huazhe Xu

Best Long Paper Runner-up

Buildee: A 3D Simulation Framework for Scene Exploration and Reconstruction with Understanding
Clémentin Boittiaux, Vincent Lepetit

Best Short Paper

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions
Anas Awadalla, Le Xue, Manli Shu, An Yan, Jun Wang, Senthil Purushwalkam, Sheng Shen, Hannah Lee, Oscar Lo, Jae Sung Park, Etash Kumar Guha, Silvio Savarese, Ludwig Schmidt, Yejin Choi, Caiming Xiong, +1 more author

Best Short Paper Runner-up

Learning to Blur is Learning to Deblur: Realistic Synthetic UHD Blurred Image via Diffusion
Xin Su, Xiuyi Jia, Chen Wu, Dianjie Lu, Guijuan Zhang, Yang Wen, Zhuoran Zheng

Accepted Papers (sorted alphabetically)

ACTUPose: Active Curriculum Training for Unsupervised Domain Adaptation in Pose Estimation. Isha Dua, Arjun Sharma, Shuaib Ahmed, Rahul Tallamraju (Poster #208)

Analyze, Generate, Improve: Failure-Based Data Generation for Large Multimodal Models. Gabriela Ben-Melech Stan, Estelle Aflalo, Avinash Madasu, Vasudev Lal, Phillip Howard (Poster #209)

Ano-Skin: Clinical Feature-Aware Diffusion Model for Dermatological Image Anonymization. YeonGyu Han, Jung Im Na, Seong Hwan Kim, Dongheon Lee (Poster #210)

Applying Longitudinal Augmentation and Data Generation (LAUGEN) in Medical Imaging. Nico Disch, Balint Kovacs, Constantin Ulrich, Robin Peretzke, Saikat Roy, Maximilian Rouven Rokuss, Yannick Kirchhoff, David Zimmerer, Klaus Maier-Hein (Poster #211)

Are Synthetic Corruptions A Reliable Proxy For Real-World Corruptions? Shashank Agnihotri, David Schader, Nico Sharei, Mehmet Ege Kaçar, Margret Keuper (Poster #212)

Ask, Pose, Unite: Scaling Data Acquisition for Close Interaction Meshes with Vision Language Models. Laura Bravo-Sánchez, Jaewoo Heo, Zhenzhen Weng, Kuan-Chieh Wang (Poster #213)

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions. Anas Awadalla, Le Xue, Manli Shu, An Yan, Jun Wang, Senthil Purushwalkam, Sheng Shen, Hannah Lee, Oscar Lo, Jae Sung Park, Etash Kumar Guha, Silvio Savarese, Ludwig Schmidt, Yejin Choi, Caiming Xiong, +1 more author (Poster #214)

Boosting Synthetic Data for VLMs via Diffusion Noise Optimization. Ren Ohkubo, Rintaro Yanagi, Hirokatsu Kataoka, Yutaka Satoh (Poster #215)

Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Maps. Alessandro Simoni, Francesco Pelosin (Poster #216)

BRADD: Balancing Representations with Anomaly Detection and Diffusion. Filipe Laitenberger, Nesta Midavaine, Ioana Simion, Stefan Vasilev, Hirokatsu Kataoka, Cees G. M. Snoek, Yuki M Asano, Mohammadreza Salehi (Poster #217)

Bridging the Domain Gap: Enhancing Underwater Laser Stripe Segmentation with Synthetic Data. Javiera Fuentes-Guíñez, Giancarlo Troni (Poster #218)

Buildee: A 3D Simulation Framework for Scene Exploration and Reconstruction with Understanding. Clémentin Boittiaux, Vincent Lepetit (Poster #219)

Challenges in 3D Data Synthesis for Training Neural Networks on Topological Features. Dylan Peek, Matthew P. Skerritt, Siddharth Pritam, Stephan Chalup (Poster #220)

COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning. Xindi Wu, Hee Seung Hwang, Polina Kirichenko, Olga Russakovsky (Poster #221)

Concept-as-Tree: Synthetic Data is All You Need for VLM Personalization. Ruichuan An, Kai Zeng, Ming Lu, Sihan Yang, Renrui Zhang, Huitong Ji, Qizhe Zhang, Yulin Luo, Hao Liang, Wentao Zhang (Poster #222)

Corner Cases: How Size and Position of Objects Challenge ImageNet-Trained Models. Mishal Fatima, Steffen Jung, Margret Keuper (Poster #223)

DemoGen: Synthetic Demonstration Generation for Data-Efficient Visuomotor Policy Learning. Zhengrong Xue, Shuying Deng, Zhenyang Chen, Yixuan Wang, Zhecheng Yuan, Huazhe Xu (Poster #224)

Diffusion Deepfake. Chaitali Bhattacharyya, Hanxiao Wang, Feng Zhang, Sungho Kim, Xiatian Zhu (Poster #225)

DispBench: Benchmarking Disparity Estimation to Synthetic Corruptions. Shashank Agnihotri, Amaan Ansari, Annika Dackermann, Fabian Rösch, Margret Keuper (Poster #226)

DocGenie: A Framework for High-Fidelity Synthetic Document Generation via Seed-Guided Multimodal LLM and Document-Aware Evaluation. Harikrishnan P M, Siddartha Reddy, Goutham Vignesh, Rohit Agrawal, Vishal Vaddina (Poster #227)

Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data. Yiwen Liu, Jessica Bader, Jae Myung Kim (Poster #228)

Evaluating Text-to-Image Diffusion Models for Texturing Synthetic Data. Thomas Lips, Francis Wyffels (Poster #229)

GASR: Generated Artwork dataset for Image Super-Resolution. Noritake Kodama, Go Ohtani, Yuto Matsuo, Rintaro Yanagi, Nakamasa Inoue, Yoshimitsu Aoki, Hirokatsu Kataoka (Poster #230)

Generate Any Scene: Evaluating and Improving Text-to-Vision Generation with Scene Graph Programming. Ziqi Gao, Weikai Huang, Jieyu Zhang, Aniruddha Kembhavi, Ranjay Krishna

Generating Synthetic Data via Augmentations for Improved Facial Resemblance in DreamBooth and InstantID. Koray Ulusan, Benjamin Kiefer (Poster #231)

Generating Synthetic Stereo Datasets using 3D Gaussian Splatting and Expert Knowledge Transfer. Filip Slezák, Magnus Kaufmann Gjerde, Joakim Bruslund Haurum, Ivan Nikolov, Morten Stigaard Laursen, Thomas B. Moeslund (Poster #232)

GIMO: Generative Image Outpainting for Early Smoke Segmentation. Sahir Shrestha, Weihao Li, Gao Zhu, Nick Barnes (Poster #233)

H2R: A Human-to-Robot Data Augmentation for Robot Pre-training from Videos. Guangrun Li, Yaoxu Lyu, Zhuoyang Liu, Chengkai Hou, Yinda Xu, Jieyu Zhang, Shanghang Zhang (Poster #234)

Harnessing Diffusion-Generated Synthetic Images for Fair Image Classification. Abhipsa Basu, Aviral Gupta, Abhijnya Bhat, Venkatesh Babu Radhakrishnan (Poster #235)

Improving Physical Object State Representation in Text-to-Image Generative Systems. Tianle Chen, Chaitanya Chakka, Deepti Ghadiyaram (Poster #236)

Instant Particle Size Distribution Measurement Using CNNs Trained on Synthetic Data. Yasser El Jarida, Youssef Iraqi, Loubna Mekouar (Poster #237)

Investigating the Influence of Image Augmentations on the Sim-to-Real Generalization of Deep Learning Perception Models. Joachim Rüter, Umut Durak, Johann C. Dauer (Poster #238)

Investigating the Scaling Effect of Instruction Templates for Training Multimodal Language Model. Shijian Wang, Linxin Song, Jieyu Zhang, Ryotaro Shimizu, Jiarui Jin, Ao Luo, Yuan Lu, Li Yao, Cunjian Chen, Julian McAuley, Wentao Zhang, Hanqian Wu

Latent Video Dataset Distillation. Ning Li, Antai Andy Liu, Jingran Zhang, Justin Cui (Poster #239)

LATTE: Learning to Reason with Vision Specialists. Zixian Ma, Jianguo Zhang, Zhiwei Liu, Jieyu Zhang, Juntao Tan, Manli Shu, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Caiming Xiong, Ranjay Krishna, Silvio Savarese (Poster #240)

Learning 3D Representations from Procedural 3D Programs. Xuweiyi Chen, Zezhou Cheng (Poster #241)

Learning from Synthetic Data for Visual Grounding. Ruozhen He, Ziyan Yang, Paola Cascante-Bonilla, Alexander C. Berg, Vicente Ordonez (Poster #242)

Learning to Blur is Learning to Deblur: Realistic Synthetic UHD Blurred Image via Diffusion. Xin Su, Xiuyi Jia, Chen Wu, Dianjie Lu, Guijuan Zhang, Yang Wen, Zhuoran Zheng (Poster #243)

Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding. Yuchen Rao, Stefan Ainetter, Sinisa Stekovic, Vincent Lepetit, Friedrich Fraundorfer (Poster #244)

LFQUIAD: Lookup-Free Quantized autoencoder for few-shot Unsupervised Industrial Anomaly Detection via Synthetic Diffusion Inpainting. Shih-Chih Lin, Shang-Hong Lai (Poster #245)

LiveVQA: Assessing Models with Live Visual Knowledge. Mingyang Fu, Yuyang Peng, Benlin Liu, Yao Wan, Dongping Chen (Poster #246)

Minimizing Data, Maximizing Performance: Generative Examples for Continual Task Learning. Mahsa Mozafarinia, Joshua Andle, Santhosh Karnik, Daniel Goldfarb, Paul Hand, Salimeh Sekeh (Poster #247)

MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World. Ankit Dhiman, Manan Shah, Venkatesh Babu Radhakrishnan (Poster #248)

MoireDB: A Formula-driven Image Dataset for Robustness Enhancement. Yuto Matsuo, Yoshihiro Fukuhara, Yuki M Asano, Hirokatsu Kataoka, Akio Nakamura (Poster #249)

MultiRef: Controllable Image Generation with Multiple Visual References. Ruoxi Chen, Siyuan Wu, Dongping Chen, Shiyun Lang, Petr Sushko, Gaoyang Jiang, Sinan Wang, Yao Wan, Ranjay Krishna (Poster #250)

Not All Samples Should Be Utilized Equally: Towards Understanding and Improving Dataset Distillation. Shaobo Wang, Yantai Yang, Qilong Wang, Kaixin Li, Linfeng Zhang, Junchi Yan (Poster #251)

Point Cloud Segmentation of Agricultural Vehicles using 3D Gaussian Splatting. Alfred T. Christiansen, Andreas H. Højrup, Morten K. Stephansen, Md Ibtihaj Amin, Taman S. Poojary, Filip Slezák, Morten Stigaard Laursen, Thomas B. Moeslund, Joakim Bruslund Haurum (Poster #252)

Quo Vadis Handwritten Text Generation for Handwritten Text Recognition? Konstantina Nikolaidou, Vittorio Pippi, Silvia Cascianelli, Marcus Liwicki, Rita Cucchiara (Poster #253)

R3ST: A Synthetic Dataset with Real Trajectories for Urban Traffic Analysis. Simone Teglia, Claudia Melis Tonti, Francesco Pro, Leonardo Russo, Andrea Alfarano, Matteo Pentassuglia, Irene Amerini (Poster #254)

RAFT: Robust Augmentation of FeaTures for Image Segmentation. Edward Steven Humes, Xiaomin Lin, Utteja Kallakuri, Tinoosh Mohsenin (Poster #255)

ReSIT: A more Realistic Synthetic Driving Dataset for Multi-Domain Image-to-Image Translation. Joonhyung Park, Yeong-Seok Kim, HeonJeong Chu, Hye-Rin Kim, Seungryong Kim (Poster #256)

RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives. Chirag Parikh, Deepti Rawat, Rakshitha R. T., Tathagata Ghosh, Ravi Kiran Sarvadevabhatla (Poster #257)

Stable Bidirectional Graph Convolutional Networks for Label-Frugal Skeleton-based Recognition. Hichem Sahbi (Poster #258)

STM2PE-Diff: Synthetically Trained Music-to-Pose Encoder Diffusion for Automated Choreography Generation. Nokap Tony Park (Poster #259)

SVI-Paste: Synthetic Dynamic Instance Copy-Paste. Sahir Shrestha, Weihao Li, Gao Zhu, Nick Barnes (Poster #260)

SynSHRP2: A Synthetic Multimodal Benchmark for Driving Safety-critical Events Derived from Real-world Driving Data. Liang Shi, Boyu Jiang, Zhenyuan Yuan, Miguel A. Perez, Feng Guo (Poster #261)

SynTable: A Synthetic Data Generation Pipeline for Unseen Object Amodal Instance Segmentation of Cluttered Tabletop Scenes. Zhili Ng, Haozhe Wang, Zhengshen Zhang, Francis E. H. Tay, Marcelo H Ang Jr (Poster #262)

Synthesizing 3D Abstractions by Inverting Procedural Buildings with Transformers. Maximilian Dax, Jordi Serrano Berbel, Jan Stria, Leonidas Guibas, Urs M Bergmann (Poster #263)

Synthetic Data Generation for Demonstrating Noise Reduction in Facial Depth Imaging. Connah Kendrick, Moi Hoon Yap, Kevin Tan (Poster #264)

Synthetic Generation and Latent Projection Denoising of Rim Lesions in Multiple Sclerosis. Alexandra Grace Roberts, Ha Manh Luu, Mert Sisman, Alexey V. Dimov, Ceren Tozlu, Ilhami Kovanlikaya, Susan Gauthier, Thanh D. Nguyen, Yi Wang (Poster #265)

Synthetic Human Action Video Data Generation with Pose Transfer. Václav Knapp, Maty Bohacek (Poster #266)

Text-guided Synthetic Geometric Augmentation for Zero-shot 3D Understanding. Kohei Torimi, Ryosuke Yamada, Daichi Otsuka, Kensho Hara, Yuki M Asano, Hirokatsu Kataoka, Yoshimitsu Aoki (Poster #267)

You Are Your Best Teacher: Semi-Supervised Surgical Point Tracking with Cycle-Consistent Self-Distillation. Valay Bundele, Mehran Hosseinzadeh, Hendrik Lensch (Poster #268)