TailedCore: Few-Shot Sampling for Unsupervised Long-Tail Noisy Anomaly Detection

Northeastern University  AiV. Co  Yonsei University  Mitsubishi Electric Research Laboratories
CVPR 2025

*Equal Contribution

Previous noise-discriminative models and few-shot methods based on greedy sampling suffer from a noise-versus-tail dilemma: the former eliminate tail classes, while the latter favor noise features along with tail-class features.

Abstract

We aim to solve unsupervised anomaly detection in a practical, challenging environment where the normal dataset is contaminated with defective regions and its product class distribution is tailed but unknown. We observe that existing models suffer from a tail-versus-noise trade-off: if a model is robust against pixel noise, its performance deteriorates on tail class samples, and vice versa. To mitigate the issue, we handle the tail class and noise samples independently. To this end, we propose TailSampler, a novel class size predictor that estimates the class cardinality of samples based on a symmetric assumption on the class-wise distribution of embedding similarities. TailSampler can be used to sample the tail class samples exclusively, allowing us to handle them separately. Based on these facets, we build a memory-based anomaly detection model, TailedCore, whose memory both captures tail class information well and is noise-robust. We extensively validate the effectiveness of TailedCore on the unsupervised long-tail noisy anomaly detection setting, and show that TailedCore outperforms the state-of-the-art in most settings.
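The idea of predicting a class size per sample and then picking out tail samples can be illustrated with a minimal sketch. This is not TailSampler itself: the abstract does not spell out the symmetric assumption, so the sketch swaps in a plain fixed cosine-similarity threshold. The function names, `sim_threshold`, `max_tail_size`, and the toy embeddings are all hypothetical.

```python
import numpy as np

def estimate_class_sizes(embeddings, sim_threshold=0.8):
    """Assumed stand-in for a class size predictor: for each sample, count
    the samples (itself included) whose cosine similarity to it reaches
    a fixed threshold."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T  # pairwise cosine similarities
    return (sims >= sim_threshold).sum(axis=1)

def select_tail_indices(class_sizes, max_tail_size=4):
    """Return indices of samples whose estimated class size is small."""
    return np.flatnonzero(class_sizes <= max_tail_size)

# Toy data: one head class with 50 samples, one tail class with 3 samples.
rng = np.random.default_rng(0)
head = np.array([1.0, 0.0, 0.0]) + 0.01 * rng.standard_normal((50, 3))
tail = np.array([0.0, 1.0, 0.0]) + 0.01 * rng.standard_normal((3, 3))
embeddings = np.vstack([head, tail])

sizes = estimate_class_sizes(embeddings)
tail_idx = select_tail_indices(sizes)
print(sizes[:3], sizes[-3:], tail_idx)
```

On this toy data the head samples receive an estimated size of 50 and the tail samples a size of 3, so only the last three indices are selected. Real image embeddings are far less cleanly separated, which is presumably why a fixed threshold is not enough in practice.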

method

Pipeline of our method





Previous methods that address either noisy datasets or long-tailed datasets suffer from a tail-versus-noise trade-off: a noise-discriminative model performs poorly on tail class samples, while algorithms such as greedy sampling, which by nature favor tail classes, also favor noisy features.
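A small sketch can show why greedy sampling picks up noise together with tail classes. It assumes that "greedy sampling" means farthest-point (k-center greedy) coreset selection, as used in memory-based detectors such as PatchCore. The 2-D features, cluster positions, and the function name `greedy_coreset` are hypothetical.

```python
import numpy as np

def greedy_coreset(features, n_select, start=0):
    """Farthest-point greedy selection: repeatedly pick the feature that is
    farthest from everything selected so far."""
    selected = [start]
    min_dist = np.linalg.norm(features - features[start], axis=1)
    for _ in range(n_select - 1):
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected

# Toy features: a dense head class, a small tail class, and one noisy feature.
rng = np.random.default_rng(0)
head = rng.normal(0.0, 0.05, size=(200, 2))                          # indices 0..199
tail = rng.normal(0.0, 0.05, size=(4, 2)) + np.array([5.0, 0.0])     # indices 200..203
noise = np.array([[0.0, 5.0]])                                       # index 204
features = np.vstack([head, tail, noise])

picked = greedy_coreset(features, n_select=3)
print(picked)
```

Starting from a head sample, the next two picks are one tail sample and the noisy feature, because both lie far from the dense head cluster. Distance alone cannot tell a rare but normal class apart from a defect.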





Experimental results compared with other baselines on three long-tail setups: Pareto, step with K=4, and step with K=1, where K denotes the number of samples in each tail class. In all setups, 60% of the classes are tail classes and all head classes are contaminated.
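As a rough illustration of the step setups, the per-class sample counts could be built as follows. The number of classes, the head class size, and the helper name `step_class_sizes` are hypothetical; only the 60% tail fraction and the values of K come from the caption.

```python
def step_class_sizes(n_classes, head_size, k, tail_fraction=0.6):
    """Step-shaped class size list: head classes keep head_size samples,
    tail classes are cut down to k samples each."""
    n_tail = round(n_classes * tail_fraction)
    return [head_size] * (n_classes - n_tail) + [k] * n_tail

# Assumed example: 15 classes with 200 samples per head class.
sizes_k4 = step_class_sizes(n_classes=15, head_size=200, k=4)
sizes_k1 = step_class_sizes(n_classes=15, head_size=200, k=1)
print(sizes_k4)
print(sizes_k1)
```

With these assumed values, 6 classes stay at 200 samples and 9 classes drop to K samples. The Pareto setup is not sketched because its parameters are not given here.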





Correlation between the accuracy on tail classes and noisy samples (x-axis) and key metrics (y-axis) related to class size prediction and few-shot sampling, for the step K=4 setup. The correlation is strong for (a) mis-sampling ratio, (b) ratio of missing few-shot samples, (e) class size prediction error, and (f) AUROC for few-shot prediction. Better embeddings improve TailSampler, which in turn improves (g) anomaly classification (image-level AUROC) and (h) anomaly segmentation (pixel-level AUROC) performance.





Ablation over the noise ratio. The performance of PatchCore deteriorates as the noise ratio increases, while TailedCore and SoftPatch are relatively insensitive to the noise ratio in every experiment. Notably, TailedCore is comparable to PatchCore when the dataset is noise-free, whereas SoftPatch does not work well in this setup.





Qualitative evaluation.




