A set of functions enabling a quick start in training models on the MozzaVID dataset, as well as evaluation of the models reported in the dataset paper.
[Data] [Paper] [Project website]
The complete set of required packages can be installed through the requirements file:

```shell
pip install -r requirements.txt
```
The file requirements_with_versions.txt specifies exact versions of the packages and can be used if exact reproducibility is desired.
PyTorch is commented out in both requirements files, since it may require a system-specific installation.
We provide two sources of data:

- Complete "raw" data [LINK]:
  To use it, download and unzip the dataset locally, then adjust the path to the data in evaluate_model.py and train_model.py. Note that the Big dataset requires over 300 GB of storage.
- HuggingFace WebDatasets [Small split] [Base split] [Large split]:
  This setup enables continuous streaming of data during training and evaluation, and requires little-to-no storage space. Continuous internet access is required. A suggested data-loading setup is provided in utils_stream.py. To use it, set DATA_MODE='stream' in evaluate_model.py and train_model.py.
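For orientation, a streaming loader in the WebDataset style can be sketched as below. This is a minimal illustration, not the code in utils_stream.py: the shard URL pattern, shard count, and stored field names ("volume.npy", "cls") are assumptions about the layout, not the actual structure of the HuggingFace repositories.

```python
# Sketch of a WebDataset-style streaming setup (assumed shard layout).
def make_shard_urls(base_url, split, num_shards):
    """Build a brace-expanded WebDataset URL pattern, e.g.
    <base_url>/<split>/shard-{000000..000009}.tar (layout is assumed)."""
    return f"{base_url}/{split}/shard-{{000000..{num_shards - 1:06d}}}.tar"

def build_loader(urls, batch_size=8):
    """Stream samples from remote .tar shards (requires the
    `webdataset` package; field names here are hypothetical)."""
    import webdataset as wds
    dataset = (
        wds.WebDataset(urls)
        .decode()                       # decode stored arrays/images
        .to_tuple("volume.npy", "cls")  # assumed field names in each sample
    )
    return wds.WebLoader(dataset, batch_size=batch_size)

# Example URL pattern for a hypothetical 10-shard "small" split:
print(make_shard_urls("https://example.com/data", "small", 10))
```

Because the shards are read over HTTP on demand, only the samples currently in flight occupy local memory, which is what makes the little-to-no-storage setup possible.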
Model checkpoints used in the paper can be downloaded from here or here.
The paths to the models have to be adjusted in evaluate_model.py and train_model.py.
A simple model training run can be started with the train_model.py script. Existing models can be evaluated with evaluate_model.py. Both files contain a list of hyperparameters that allows exploring all variations of the dataset.
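To illustrate how such dataset variations might be enumerated for a sweep, here is a small sketch. The option names (data_mode, split) and their values are assumptions for illustration; the actual hyperparameter variables live in train_model.py and evaluate_model.py.

```python
from itertools import product

# Hypothetical option names; the real variables are defined in train_model.py.
DATA_MODES = ["local", "stream"]     # raw local files vs. HuggingFace streaming
SPLITS = ["small", "base", "large"]  # the three dataset splits

def make_run_configs():
    """Enumerate one config dict per (data mode, split) combination."""
    return [
        {"data_mode": mode, "split": split}
        for mode, split in product(DATA_MODES, SPLITS)
    ]

configs = make_run_configs()
print(len(configs))  # 2 modes x 3 splits = 6 configurations
```

Each resulting dict could then drive one training or evaluation run, so all dataset variations are covered systematically rather than edited by hand.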
If you use our dataset, or any of this code for academic work, please consider citing our publication.
@misc{pieta2024b,
title={MozzaVID: Mozzarella Volumetric Image Dataset},
author={Pawel Tomasz Pieta and Peter Winkel Rasmussen and Anders Bjorholm Dahl and Jeppe Revall Frisvad and Siavash Arjomand Bigdeli and Carsten Gundlach and Anders Nymark Christensen},
year={2024},
howpublished={arXiv:2412.04880 [cs.CV]},
eprint={2412.04880},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.04880},
}

MIT License (see LICENSE file).