Kaputt: A Large-Scale Dataset for Visual Defect Detection

  • Sebastian Höfer¹
  • Dorian Henning¹
  • Artemij Amiranashvili¹
  • Douglas Morrison¹
  • Mariliza Tzes¹
  • Ingmar Posner¹,²
  • Marc Matvienko¹
  • Alessandro Rennola¹
  • Anton Milan¹

  ¹ Amazon FTR   ² University of Oxford

  ICCV

Abstract

We present a novel large-scale dataset for defect detection in a logistics setting. Recent work on industrial anomaly detection has primarily focused on manufacturing scenarios with highly controlled poses and a limited number of object categories. Existing benchmarks like MVTec-AD (Bergmann et al., 2021) and VisA (Zou et al., 2022) have reached saturation, with state-of-the-art methods achieving AUROC scores of up to 99.9%. In contrast to manufacturing, anomaly detection in retail logistics faces new challenges, particularly in the diversity and variability of object pose and appearance. Leading anomaly detection methods fall short when applied to this new setting. To bridge this gap, we introduce a new benchmark that overcomes the limitations of existing datasets. With over 230,000 images (and more than 29,000 defective instances), it is 40 times larger than MVTec-AD and contains more than 48,000 distinct objects. To validate the difficulty of the problem, we conduct an extensive evaluation of multiple state-of-the-art anomaly detection methods, demonstrating that they do not surpass 56.96% AUROC on our dataset. Further qualitative analysis confirms that existing methods struggle to leverage normal samples under heavy pose and appearance variation. With our large-scale dataset, we set a new benchmark and encourage future research towards solving this challenging problem in retail logistics anomaly detection. The dataset is available for download at https://www.kaputt-dataset.com.
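
To put the AUROC figures above in context: AUROC is the probability that a randomly chosen defective sample receives a higher anomaly score than a randomly chosen normal sample, so 50% corresponds to random ranking and 100% to perfect separation. The following is a minimal sketch of that definition in plain Python. It is not the evaluation code used for the benchmark; the function name `auroc` and the toy labels and scores are made up for illustration.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (defective, normal) pairs in which the
    defective sample has the higher anomaly score; ties count one half.

    labels: 1 for defective, 0 for normal.
    scores: anomaly scores, higher means "more likely defective".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defective and one normal sample")

    wins = 0.0
    # Quadratic pairwise comparison: fine for a sketch, too slow for
    # hundreds of thousands of images (a rank-based version scales better).
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration (not taken from the dataset).
labels = [0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.3]
toy_auroc = auroc(labels, scores)
```

Read this way, a detector that scores 56.96% AUROC ranks a defective item above a normal one only slightly more often than a coin flip would.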

Dataset Download

Please fill out the following form to obtain access to the dataset. Once completed, you will receive detailed information on how to download the dataset.

Acknowledgements

We thank our collaborators in Amazon's operations, hardware and software engineering, as well as our annotation teams. Their invaluable contributions to hardware development, software implementation, data collection, and labeling efforts were essential to the success of this work.
