PyTorch-OOD
About this library
PyTorch-OOD is an open-source Python library designed for out-of-distribution (OOD) detection and related tasks like anomaly detection, novelty detection, and open-set recognition. It provides a flexible framework for training, testing, and benchmarking OOD detection methods with PyTorch models. With a standardized workflow, the library enables reproducible research and easy integration into existing deep learning pipelines.
PyTorch-OOD supports a range of detectors, datasets, and benchmarks, allowing researchers to experiment with various OOD methods across multiple domains, including image classification, segmentation, and text classification.
Key features
- Comprehensive detector support:
- Probability-based: Maximum Softmax Probability (MSP), Monte Carlo Dropout (MCD).
- Logit-based: Energy-Based OOD detection (EBO), Temperature Scaling.
- Feature-based: Mahalanobis Distance (MD), k-Nearest Neighbors (kNN).
- Activation pruning: ReAct, Activation Shaping (ASH).
- Dataset integration: Auto-downloadable access to popular OOD datasets (e.g., CIFAR10-C, ImageNet-O, iNaturalist).
- Benchmarking tools: Evaluate OOD detectors on standardized benchmarks for classification, segmentation, and text data.
- Unified API: Common interface for all detectors with
fit
andpredict
methods, ensuring consistency and ease of use. - Extensibility: Modular design allows adding new datasets, detectors, and workflows.
Integration and compatibility
PyTorch-OOD is fully compatible with PyTorch models and workflows. It provides a common interface for initializing and evaluating OOD detectors, making it a seamless addition to deep learning projects.
- Frameworks supported: PyTorch
- Installation instructions:
For basic installation:
pip install pytorch-ood
For the latest development branch:
pip install git+ssh://git@github.com/kkirchheim/pytorch-ood.git@dev
Extending PyTorch-OOD for various domains
PyTorch-OOD’s modular design and dataset integration make it versatile across different domains:
Computer vision: OOD detection for image classification and segmentation tasks using datasets like TinyImageNet and LSUN. Feature-based methods like Mahalanobis Distance can enhance robustness.
Natural language processing (NLP): Evaluate text models for OOD detection using datasets like 20 Newsgroups and Reuters. Text embeddings and logit-based detectors (e.g., Temperature Scaling) can be used for effective OOD classification.
Tabular data: Simplify OOD detection in structured datasets by leveraging feature-based or probability-based detectors.
Why use PyTorch-OOD?
- Standardized workflows: Provides a clear three-step process: train a model, initialize an OOD detector, and evaluate performance.
- Reproducibility: Benchmark datasets and unified APIs ensure reproducible research.
- Versatility: Applicable across multiple domains, including computer vision, NLP, and tabular data.
- Educational value: An excellent resource for teaching OOD concepts and methods in machine learning.
Use cases
- Healthcare: Detecting anomalies in medical images.
- Finance: Identifying novel fraud patterns in transaction datasets.
- Autonomous driving: Segmenting unexpected obstacles in road scenarios.
- NLP: Evaluating robustness of sentiment analysis models to out-of-vocabulary inputs.
- Many others: Use the “Improve this page” button to add a use-case!
Tutorials and resources
- Getting started guide: PyTorch-OOD Documentation
- Example notebooks: Explore examples for detectors, benchmarks, and datasets here.
Questions?
For more assistance, visit the GitHub repository or join the community for discussions and support.
See also
- Talk: Intro to Out-of-Distribution Detection: Learn more about the pervasive problem of out-of-distribution data, and techniques available to mitigate this problem.
- Talk: Trustworthy LLMs & Ethical AI: Learn how DeTox can be used to remove toxic embeddings in LLMs.
- Workshop: Trustworthy AI: Explainability, Bias, Fairness, and Safety: A beginner-friendly workshop on Trustworthy AI/ML concepts and mitigation tools, including AIF360, OOD detection, and explainability methods.
- Library: AI Fairness 360 (AIF360): AI Fairness 360 (AIF360) is a scikit-learn-compatible open-source Python library designed to detect and mitigate bias in machine learning models