I organized and taught a workshop on “Reproducibility and Open Science Practices in Machine Learning” for the AI-DOC doctoral program at Aalto University.
The workshop addressed a fundamental question: what is reproducibility, and why does it matter in machine learning research?
Topics covered:
- Environment setup with containers and package management
- Reproducible Jupyter notebooks
- Experiment tracking with MLflow
- Modular training pipelines with PyTorch Lightning
- Model sharing via Hugging Face
- Distributed training and multi-GPU optimization
- Overfitting prevention
Collaborators: Simo Tuomisto, Hossein Firooz, Luca Ferranti
Resources: