Multi-Domain Long-Tailed Recognition

On Multi-Domain Long-Tailed Recognition,
Imbalanced Domain Generalization and Beyond

Yuzhe Yang¹ Hao Wang² Dina Katabi¹

¹ MIT CSAIL ² Rutgers University

Abstract

Real-world data often exhibit imbalanced label distributions. Existing studies on data imbalance focus on single-domain settings, i.e., samples are from the same data distribution. However, natural data can originate from distinct domains, where a minority class in one domain could have abundant instances from other domains. We formalize the task of Multi-Domain Long-Tailed Recognition (MDLT), which learns from multi-domain imbalanced data, addresses label imbalance, domain shift, and divergent label distributions across domains, and generalizes to all domain-class pairs. We first develop the domain-class transferability graph, and show that such transferability governs the success of learning in MDLT. We then propose BoDA, a theoretically grounded learning strategy that tracks the upper bound of transferability statistics, and ensures balanced alignment and calibration across imbalanced domain-class distributions. We curate five MDLT benchmarks based on widely-used multi-domain datasets, and compare BoDA to twenty algorithms that span different learning strategies. Extensive and rigorous experiments verify the superior performance of BoDA. Further, as a byproduct, BoDA establishes a new state of the art on Domain Generalization benchmarks, highlighting the importance of addressing data imbalance across domains, which can be crucial for improving generalization to unseen domains.
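To make the setting in the abstract concrete, the following is a minimal sketch of what "divergent label distributions across domains" can look like. It is not taken from the paper or its code release: the domain names, class names, sample counts, and the `low`/`high` thresholds are all made-up illustration values, and the helper functions are hypothetical.

```python
from collections import Counter

# Toy (domain, class) labels; domain and class names are made up for illustration.
samples = (
    [("photo", "dog")] * 90 + [("photo", "giraffe")] * 5
    + [("sketch", "dog")] * 8 + [("sketch", "giraffe")] * 70
)

def label_distributions(pairs):
    """Return {domain: {class: fraction of that domain's samples}}."""
    pair_counts = Counter(pairs)
    domain_totals = Counter(domain for domain, _ in pairs)
    dists = {}
    for (domain, cls), n in pair_counts.items():
        dists.setdefault(domain, {})[cls] = n / domain_totals[domain]
    return dists

def minority_elsewhere_abundant(pairs, low=0.2, high=0.5):
    """List (class, scarce_domain, rich_domain) triples.

    A class counts as a minority in a domain if its fraction is below `low`,
    and as abundant in another domain if its fraction is at least `high`.
    Both thresholds are arbitrary illustration values.
    """
    dists = label_distributions(pairs)
    found = []
    for scarce, scarce_dist in dists.items():
        for cls, frac in scarce_dist.items():
            if frac >= low:
                continue
            for rich, rich_dist in dists.items():
                if rich != scarce and rich_dist.get(cls, 0.0) >= high:
                    found.append((cls, scarce, rich))
    return sorted(found)

dists = label_distributions(samples)
triples = minority_elsewhere_abundant(samples)
```

In this toy data, each class is a minority in one domain and the majority in the other, which is the situation the abstract describes as a minority class in one domain having abundant instances in other domains.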

Paper

On Multi-Domain Long-Tailed Recognition, Imbalanced Domain Generalization and Beyond
Yuzhe Yang, Hao Wang, and Dina Katabi
European Conference on Computer Vision (ECCV 2022)
[Paper] • [Poster] • [Code] • [Blog Post] • [BibTeX]

Links


[Paper] • [Poster] • [Blog Post] • [Code, Data, and Models]

Highlights

(1) New Task: Multi-Domain Long-Tailed Recognition (MDLT).

(2) New Techniques: Domain-Class Transferability Graph & Transferability Statistics.

(3) Intriguing Observations:

Divergent Label Distributions Hamper Transferable Features

Transferability Statistics Characterize Generalization

(4) Theoretical Guarantees: BoDA — A Loss that Bounds the Transferability Statistics

(5) New MDLT Benchmarks & Imbalanced Domain Generalization
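As a rough illustration of the domain-class transferability graph named in highlight (2), the sketch below treats each (domain, class) pair as a node and weights each directed edge by a feature-space distance. This page does not give the definition, so everything here is an assumption: the edge weight (mean Euclidean distance from one pair's features to another pair's centroid), the toy 2-D features, and the three summary averages are illustrative choices that may differ from the paper's actual transferability statistics.

```python
import math
from itertools import product

# Toy 2-D features per (domain, class) node; all values are made up.
features = {
    ("photo", "dog"): [(0.0, 0.0), (0.2, 0.0)],
    ("photo", "cat"): [(3.0, 0.0), (3.2, 0.0)],
    ("sketch", "dog"): [(0.0, 1.0), (0.2, 1.0)],
    ("sketch", "cat"): [(3.0, 1.0), (3.2, 1.0)],
}

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))

def transferability(src, dst):
    """Assumed measure: mean Euclidean distance from src features to dst centroid."""
    mu = centroid(features[dst])
    return sum(math.dist(z, mu) for z in features[src]) / len(features[src])

# One weighted edge per ordered pair of distinct (domain, class) nodes.
graph = {
    (a, b): transferability(a, b)
    for a, b in product(features, repeat=2)
    if a != b
}

def mean_edge(select):
    """Average edge weight over the edges picked by `select(src, dst)`."""
    vals = [w for (a, b), w in graph.items() if select(a, b)]
    return sum(vals) / len(vals)

# Hypothetical summary statistics: average distance per edge type.
same_domain_diff_class = mean_edge(lambda a, b: a[0] == b[0] and a[1] != b[1])
diff_domain_same_class = mean_edge(lambda a, b: a[0] != b[0] and a[1] == b[1])
diff_domain_diff_class = mean_edge(lambda a, b: a[0] != b[0] and a[1] != b[1])
```

Under this assumed measure, a smaller distance between the same class in different domains than between different classes would indicate features that transfer across domains while still separating classes. The toy features are constructed so that this holds.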

Citation

@inproceedings{yang2022multi,
  title={On Multi-Domain Long-Tailed Recognition, Imbalanced Domain Generalization and Beyond},
  author={Yang, Yuzhe and Wang, Hao and Katabi, Dina},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2022}
}

Related Publications

Delving into Deep Imbalanced Regression
Yuzhe Yang, Kaiwen Zha, Ying-Cong Chen, Hao Wang, and Dina Katabi
International Conference on Machine Learning (ICML 2021), Long Talk
[Project Page] • [Paper] • [Code] • [Blog Post] • [Talk]

Rethinking the Value of Labels for Improving Class-Imbalanced Learning
Yuzhe Yang and Zhi Xu
Conference on Neural Information Processing Systems (NeurIPS 2020)
[Project Page] • [Paper] • [Code] • [Talk]
