Overview
The workshop aims to explore the use of synthetic data in training and evaluating computer vision models, as well as in other related domains. Over the last decade, advances in computer vision were catalyzed by the release of painstakingly curated human-labeled datasets. More recently, researchers have increasingly turned to synthetic data as an alternative to labor-intensive human-labeled datasets because of its scalability, customizability, and cost-effectiveness. Synthetic data offers the potential to generate large volumes of diverse, high-quality vision data tailored to specific scenarios and edge cases that are hard to capture in real-world data. However, challenges remain, such as the domain gap between synthetic and real-world data, potential biases in synthetic generation, and the generalizability of models trained on synthetic data. We hope the workshop can provide a forum to discuss and encourage further exploration in these areas.
Invited Speakers
Schedule
- 09:00 – 09:10 Opening
- 09:10 – 09:50 Talk by Ludwig Schmidt
- 09:50 – 10:30 Talk by Ruslan Salakhutdinov
- 10:30 – 10:50 Break
- 10:50 – 11:30 Talk by Yale Song
- 11:30 – 12:10 Talk by Jia Deng
- 12:10 – 13:30 Lunch Break
- 13:30 – 14:30 Poster Session
- 14:30 – 15:10 Talk by Ani Kembhavi
- 15:10 – 15:50 Talk by Ming Lin
- 15:50 – 16:10 Break
- 16:10 – 16:50 Talk by Yannis Kalantidis
- 16:50 – 17:05 Oral · CinePile: A Long Video Question Answering Dataset and Benchmark
- 17:05 – 17:20 Oral · GenAI-Bench: A Holistic Benchmark for Compositional Text-to-Visual Generation
- 17:20 – 17:30 Closing
Poster Session
- Time: June 18 · 1:30 – 2:30 PM
- Location: Arch Building Exhibit Hall
- Poster Numbers: #300 – #349
Notice: the poster session location is different from the talk venue.
Awards
Best Long Paper
Long Paper Honorable Mention
Short Paper Honorable Mention
Accepted Papers · 42 papers
Call for Papers
We invite papers on the use of synthetic data for training and evaluating computer vision models. We welcome submissions along two tracks:
- Full papers: Up to 8 pages, not including references/appendix.
- Short papers: Up to 4 pages, not including references/appendix.
Topics of interest include:
- Effectiveness: What is the most effective way to generate and leverage synthetic data? How "realistic" does synthetic data need to be?
- Efficiency and scalability: Can we make synthetic data generation more efficient and scalable without sacrificing quality?
- Benchmark and evaluation: What benchmark and evaluation methods are needed to assess the efficacy of synthetic data for computer vision?
- Risks and ethical considerations: What ethical questions and risks are associated with synthetic data (e.g., bias amplification), and how can we address them?
- Applications: Beyond existing attempts at leveraging synthetic data to train visual recognition and vision-language models, what other tasks in computer vision or related fields (e.g., robotics, NLP) could benefit from synthetic data?
- Other open problems: How do we decide which type of data to use, synthetic or real-world data? What is the optimal way to combine both if both are available?
Important Workshop Dates
- Submission Deadline: March 30 · 11:59 PM PT
- Notification: April 9 · 11:59 PM PT
- Camera Ready: April 24 · 11:59 PM PT
- Workshop Date: June 18, 2024 · Full day
Related Workshops
- Machine Learning with Synthetic Data @ CVPR 2022
- Synthetic Data for Autonomous Systems @ CVPR 2023
- Synthetic Data Generation with Generative AI @ NeurIPS 2023