Overview
Welcome to the 3rd Workshop on Synthetic Data for Computer Vision (SynData4CV) at CVPR 2026!
During the last decade, advances in computer vision have been catalyzed by the release of meticulously curated human-labeled datasets. More recently, researchers have increasingly turned to synthetic data as an alternative to labor-intensive human labeling, attracted by its scalability, customizability, and cost-effectiveness. Synthetic data offers the potential to generate large volumes of diverse, high-quality vision data tailored to specific scenarios and edge cases that are hard to capture in real-world data. However, challenges remain, including the domain gap between synthetic and real-world data, potential biases in synthetic generation, and the generalizability of models trained on synthetic data.
This workshop aims to provide a forum for discussing these challenges and to encourage further exploration in these areas.
Topics of interest include, but are not limited to:
- Effectiveness: What is the most effective way to generate and leverage synthetic data? Does synthetic data need to "look" realistic?
- Efficiency and scalability: Can we make synthetic data generation more efficient and scalable without sacrificing much quality?
- Benchmark and evaluation: What benchmarks and evaluation methods are needed to assess the efficacy of synthetic data for computer vision?
- Risks and ethical considerations: How can we mitigate the risks of generating and using synthetic data? How do we address relevant ethical questions, such as bias amplification in synthetic datasets?
- Applications: What are other tasks in computer vision or other related fields (e.g., robotics, NLP) that could benefit from synthetic data?
- Other open problems: How do we decide which type of data to use, synthetic or real-world data? What is the optimal way to combine both if both are available?
Invited Speakers
Schedule
Workshop date: June 4, 2026, Afternoon (Half day)
Location: Room 607
| 1:00 PM - 1:10 PM | Opening Remarks |
| 1:10 PM - 1:45 PM | Invited Talk: Manling Li |
| 1:45 PM - 2:20 PM | Invited Talk: Jia Deng |
| 2:20 PM - 2:55 PM | Invited Talk: Georgia Gkioxari |
| 2:55 PM - 3:10 PM | Break |
| 3:10 PM - 3:45 PM | Invited Talk: Andrew Owens |
| 3:45 PM - 4:20 PM | Invited Talk: Nupur Kumari |
| 4:20 PM - 4:30 PM | Closing Remarks |
| 4:30 PM - 5:30 PM | Poster Session |
Accepted Papers (sorted alphabetically)
| Addressing Data Scarcity in Depth-Based Human Action Recognition via Zero-Shot Depth Estimation. Rebeka Angyal, Pedro Hermosilla, Martin Kampel, Irene Ballester |
| AfriST-VQA: Benchmarking MLLMs for Scene-Text Visual Question Answering for African Languages. Henry Gagnier |
| Appreciate the View: A Task-Aware Evaluation Framework for Novel View Synthesis. Saar Stern, Ido Sobol, Or Litany |
| Assessing the Predictive Value of Physics-Grounded Synthetic Data for Computer Vision in Space Environments. Arianna Issitt, Emily Happy, Elijah Clark, Mackenzie J. Meni, Ryan T. White |
| Auto-Comp: Scalable Controlled Synthetic Benchmarks for VL Compositionality. Cristian Sbrolli, Toshihiko Yamasaki, Matteo Matteucci |
| Avatar4D: Synthesizing Domain-Specific 4D Humans for Real-World Pose Estimation. Jerrin Bright, Zhibo Wang, Dmytro Klepachevskyi, Yuhao Chen, Sirisha Rambhatla, David A. Clausi, John S. Zelek |
| Beyond Objects: Contextual Synthetic Data Generation for Fine-Grained Classification. William Yang, Xindi Wu, Zhiwei Deng, Esin Tureci, Olga Russakovsky |
| Beyond Photorealism: Counterfactual Synthetic Bundles for Invariant Sim-to-Real Vision. Murari Ambati |
| Beyond Raw Signals: Undecoded Generative Latents as Privileged Synthetic Data. Cristian Sbrolli, Nicolas Michel, Matteo Matteucci, Toshihiko Yamasaki |
| Completing Missing Modalities: Synthetic Data for RGB–Infrared–Thermal–Text Person Re-Identification. Muhammad Umair, Muhammad Hammad Musaddiq, Jun Zhou, Ahmad Muhammad |
| CryoDiff: Cryo-EM Synthesis via Biophysics and Cycle-Consistent Diffusion. Genpei Zhang, Yuntian Yang, Siqi Wu, Ningyan Zhang, Seonghui Min, Jie Wu, Christopher Braxton Owens, Minhao Wu, Wanyue Feng, Gus LW Hart, Runmin Jiang, Min Xu |
| Diffusion-Augmented Coreset Expansion for Scalable Dataset Distillation. Ali Abbasi, Shima Imani, Chenyang An, Gayathri Mahalingam, Harsh Shrivastava, Maurice Diesendruck, Hamed Pirsiavash, Pramod Sharma, Soheil Kolouri |
| Disentangled Anatomy-Disease Diffusion (DADD) for Controllable Ulcerative Colitis Progression Synthesis. Umut Dundar, Alptekin Temizel |
| Durian: Dual Reference Image-Guided Portrait Animation with Attribute Transfer. Hyunsoo Cha, Byungjun Kim, Hanbyul Joo |
| Evaluating the Trade-offs of MDL-to-UsdPreviewSurface Material Simplification in NVIDIA Isaac Sim: Visual Quality, Feature Preservation, and AI Task Performance. Zihou Zhu, Mei Haitao, Haolong Zheng, Zhou Zhang |
| Few-Shot Synthetic Data Generation with Diffusion Models for Downstream Vision Tasks. Daniil Dushenev, Nazariy Karpov, Daniil Zinovjev, Alexander Gorin, Konstantin Kulikov |
| Fréchet Inception Distance is Failing to Preserve Rank Consistency for Synthetic Out-of-Distribution Samples. Linghui Liu, Henrike Stephani, Janis Keuper |
| Generating Synthetic Illumination Variation with Co-Located Relighting. Yash Turkar, Karthik K Dantu |
| Grounding Synthetic Data Generation With Vision and Language Models. Ümit Mert Çağlar, Alptekin Temizel |
| How Far Can We Go With Synthetic Data for Audio-Visual Sound Source Localization? Arda Senocak, Sooyoung Park, Tae-Hyun Oh, Joon Son Chung |
| Hybrid Rendering for Multimodal Autonomous Driving: Merging Neural and Physics-Based Simulation. Máté Tóth, Péter Kovács, Réka Bencses, Zoltan Bendefy, Zoltan Hortsin, Balázs Teréki, Tamas Matuszka |
| iARCS: Iterative Agentic RL for Controllable 3D Scene Generation. Saugat Adhikari, Ashok Prasad Neupane, Pramish Paudel, Ajad Chhatkuli, Danda Pani Paudel |
| Masked Language Prompting for Generative Data Augmentation in Few-shot Fashion Style Recognition. Yuki Hirakawa, Ryotaro Shimizu |
| Multi-Objective Photoreal Simulation (MOPS) Dataset for Computer Vision in Robot Manipulation. Maximilian Xiling Li, Paul Mattes, Nils Blank, Rudolf Lioutikov |
| Narrowing the Performance Gap in Synthetic VLM Pre-training via Multi-Generator Ensembles. Leonardo Brusini, Cristian Sbrolli, Eugenio Lomurno, Toshihiko Yamasaki, Matteo Matteucci |
| Object-Centric Data Synthesis for Category-level Object Detection. Vikhyat Agarwal, Jiayi Cora Guo, Declan Hoban, Sissi Yuxi Zhang, Nick Moran, Peter Cho, Srilakshmi Pattabiraman, Shantanu H. Joshi |
| One Category One Prompt: Dataset Distillation using Diffusion Models. Ali Abbasi, Ashkan Shahbazi, Hamed Pirsiavash, Soheil Kolouri |
| OrbitArch. I-Ting Tsai, Bharath Hariharan |
| Personalized Generative Models for Contextual Debiasing. Xinran Liang, Esin Tureci, Prachi Sinha, Ye Zhu, Vikram V. Ramaswamy, Olga Russakovsky |
| PLLM: Pseudo-Labeling Large Language Models for CAD Program Synthesis. Yuanbo Li, Dule Shu, Yan-Ying Chen, Matthew Klenk, Daniel Ritchie |
| Privacy-Aware Synthetic Video Benchmarking and Relational Evaluation for Worker-Under-Suspended-Load Detection. Anshu Singh, Alejandro Seif |
| ProductConsistency: Improving Product Identity Preservation in Instruction-Based Image Editing via SFT and RL. Mukund Khanna, Raj Singh Yadav, Kunal Singh |
| RareCrafter: Controllable Generative Augmentation for Rare Object Detection in Driving Scenes. Mohadeseh Ghafoori, Danielle Lee, Kurt Hammen, Collin Meese, Mark Nejad |
| Realiz3D: 3D Generation Made Photorealistic via Domain-Aware Learning. Ido Sobol, Kihyuk Sohn, Yoav Blum, Egor Zakharov, Max Bluvstein, Andrea Vedaldi, Or Litany |
| Representation-Conditioned Diffusion Models for Guided Training Data Generation. Nithesh Chandher Karthikeyan, Jonas Unger, Gabriel Eilertsen |
| Restereo: Unifying diffusion stereo video generation and restoration. Xingchang Huang, Ashish Kumar Singh, Florian Dubost, Cristina Nader Vasconcelos, Sakar Khattar, Liang Shi, Christian Theobalt, Cengiz Oztireli, Gurprit Singh |
| SAIL: Similarity-Aware Guidance and Inter-Caption Augmentation-based Learning for Weakly-Supervised Dense Video Captioning. Ye-Chan Kim, SeungJu Cha, Si-Woo Kim, MinJu Jeon, HyunGee Kim, Dong-Jin Kim |
| Scaling Up 3D Forest Vision with Synthetic LiDAR. Yihang She, Andrew Blake, David Coomes, Srinivasan Keshav |
| Sea-Mie: Physically-Based Synthetic Fog for Maritime Image Defogging via Curriculum Learning. Stelios Avlakiotis, Peter Ter Heerdt, Thomas De Kerf, Steve Vanlanduit |
| Sequential Dataset for Satellite Pose Estimation and a Frequency-Space Neural Operator for HIL-Free Generalization Benchmarking. Woojin Cho, Junghwan Park, Steve Andreas Immanuel, Junmin Park, Seokhyun Chin, Jiayun Wang |
| Sim-to-Real Metrology: Calibrated Digital Twins for Fringe Projection Profilometry. Noble Austine, Vuppu Eshwar Sai, Vaishnavi Ravi, Madhu S. Nair, Gorthi Rama Krishna Sai Subrahmanyam |
| SJEPA: Joint Embedding Predictive Architecture for Synthetic-to-Real Alignment. Shentong Mo |
| Structure-Consistent Joint Image-Mask Synthesis for Data-Scarce Medical Image Segmentation. Ningyan Zhang, Weiyi Zhang, Mostofa Rafid Uddin, Xingjian Li, Min Xu |
| Structure-retained low-rank adapters for weather synthesis. Shunxin Wang, Alexandros Stergiou, Luuk Spreeuwers, Estefania Talavera, Nicola Strisciuglio |
| StyleText: A Large-Scale Dataset and Benchmark for Stylized Scene Text Inpainting. Aleksandr Simonyan, Nipun Jindal |
| Synthetic Data Generation for Long-Tail Medical Image Classification: A Case Study in Skin Lesions. Jiaxiang Jiang, Mahesh Subedar, Omesh Tickoo |
| Synthetic Designed Experiments for Diagnosing Vision Model Failures. Krisanu Sarkar |
| Theory of Space: Evaluating Active Spatial Belief Construction in Foundation Models with Synthetic 3D Environments. Pingyue Zhang, Zihan Huang, Yue Wang, Jieyu Zhang, Letian Xue, Zihan Wang, Qineng Wang, Keshigeyan Chandrasegaran, Ruohan Zhang, Yejin Choi, Ranjay Krishna, Jiajun Wu, Li Fei-Fei, Manling Li |
| Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision. Hyunsoo Cha, Wonjung Woo, Byungjun Kim, Hanbyul Joo |
| Video-Consistent Synthetic Skiing Trajectories. M'Saydez Campbell, Rémi Emonet, Damien Muselet, Christophe Ducottet |
| WaterGen: Decoupling Scene and Medium in Underwater Image Generation. Jiayi Wu, Tianfu Wang, Tianyi Xiong, Dehao Yuan, Xiaomin Lin, Md Jahidul Islam, Cornelia Fermuller, Christopher Metzler, Yiannis Aloimonos |
| When Does Synthetic Data Help? A Spectral Theory of Task-Relevant Domain Gap with Applications to Guided Generation and Bias Auditing. Kaustubh S. Bukkapatnam, Rayan Malik |
| Why Training with Synthetic Data Fails for OOD: Distribution Gap Amplifies Noise Misalignment in Diffusion Models. Ying Hua, Jessica Bader, Jae Myung Kim, Zeynep Akata |
| WireSeg-32K: A Physics-Grounded Synthetic Dataset for Wire Instance Segmentation. Zilin Dai, Lehong Wang, Yi Yang, Xiang Fei |
Call for Papers
We invite submissions on topics related to synthetic data for computer vision, including but not limited to:
- Novel methods for generating synthetic data
- Techniques for bridging the domain gap between synthetic and real data
- Benchmarks and evaluation metrics for synthetic data
- Applications of synthetic data in various computer vision tasks
- Ethical considerations and bias mitigation in synthetic data generation
- Efficient and scalable synthetic data generation pipelines
Submission Guidelines:
- Papers should be formatted according to the CVPR 2026 template
- Short papers: 4 pages (excluding references); Long papers: 8 pages (excluding references)
- Submissions should be made through OpenReview
- All submissions will be double-blind reviewed
- Accepted papers will NOT be included in CVPR proceedings, so there are no double submission concerns.
Important Workshop Dates
- Submission opens: February 25, 2026
- Submission deadline: March 17, 2026 (11:59 AM UTC)
- Notification of acceptance: TBD
- Camera-ready submission deadline: TBD
- Workshop date: June 4, 2026, Afternoon (Half day), Room 607
Related Workshops
- Synthetic Data for Computer Vision @ CVPR 2025
- Machine Learning with Synthetic Data @ CVPR 2022
- Synthetic Data for Autonomous Systems @ CVPR 2023
- Synthetic Data Generation with Generative AI @ NeurIPS 2023