Trustworthy AI - Explainability, Bias, Fairness, and Safety
About this resource
The Trustworthy AI: Explainability, Bias, Fairness, and Safety workshop equips participants with practical skills to evaluate and improve the trustworthiness of machine learning models. Spanning structured data (tabular), natural language processing (NLP), and computer vision applications, this workshop integrates tools and techniques to enhance fairness, explainability, and reliability. Participants will explore real-world applications using cutting-edge libraries like AIF360, PyTorch-OOD, and GradCAM.
The lesson is now in the alpha phase, meaning it has been piloted and is ready for broader use and feedback. Materials may continue to evolve with additional topics and exercises.
💡 Have ideas for other topics or tools to include? We welcome suggestions! Please post an issue on the lesson’s GitHub repository.
Topics covered
- Fairness: Hands-on bias detection and mitigation using AIF360 (see the first sketch after this list).
- Explainability: Techniques such as GradCAM and linear probes for visualizing model behavior (see the GradCAM sketch below).
- Safety and Reliability: Out-of-distribution (OOD) detection with PyTorch-OOD (see the final sketch below).
- Trustworthiness: Ensuring models are interpretable, reproducible, and aligned with ethical practices.
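To give a flavor of the fairness material, here is a minimal sketch of measuring group fairness with AIF360. The toy data frame, column names, and group encodings below are hypothetical stand-ins; the workshop itself works with real datasets.

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

# Hypothetical tabular data: 'sex' is the protected attribute, 'hired' the label.
df = pd.DataFrame({
    "sex":   [1, 1, 1, 0, 0, 0],        # 1 = privileged group, 0 = unprivileged
    "score": [0.9, 0.7, 0.8, 0.6, 0.4, 0.5],
    "hired": [1, 1, 0, 1, 0, 0],
})

dataset = BinaryLabelDataset(
    df=df,
    label_names=["hired"],
    protected_attribute_names=["sex"],
    favorable_label=1,
    unfavorable_label=0,
)

metric = BinaryLabelDatasetMetric(
    dataset,
    privileged_groups=[{"sex": 1}],
    unprivileged_groups=[{"sex": 0}],
)

# Disparate impact: ratio of favorable-outcome rates between groups (ideal: 1.0).
print("Disparate impact:", metric.disparate_impact())
# Statistical parity difference: difference in favorable rates (ideal: 0.0).
print("Statistical parity difference:", metric.statistical_parity_difference())
```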
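For explainability, the GradCAM episodes use the pytorch-grad-cam package. A minimal sketch, assuming a pretrained torchvision ResNet-18 and a placeholder input batch (the target class index 281 is arbitrary, chosen only for illustration):

```python
import torch
from torchvision.models import resnet18, ResNet18_Weights
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

# Pretrained classifier; the last block of layer4 is a common GradCAM target for ResNets.
model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
cam = GradCAM(model=model, target_layers=[model.layer4[-1]])

# Stand-in input; in practice, a normalized image tensor of shape (N, 3, 224, 224).
input_tensor = torch.rand(1, 3, 224, 224)

# Ask where the model "looks" when predicting class 281 (hypothetical target class).
heatmap = cam(input_tensor=input_tensor,
              targets=[ClassifierOutputTarget(281)])
print(heatmap.shape)  # (1, 224, 224): one coarse localization map per input
```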
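Finally, the safety episodes draw on PyTorch-OOD. A minimal sketch of the maximum-softmax-probability baseline, assuming a trained classifier (the tiny model here is a placeholder):

```python
import torch
import torch.nn as nn
from pytorch_ood.detector import MaxSoftmax

# Placeholder classifier; in practice, use your trained model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()

# MaxSoftmax flags inputs whose maximum softmax probability is low,
# a common baseline for out-of-distribution detection.
detector = MaxSoftmax(model)

x = torch.rand(8, 3, 32, 32)   # stand-in batch of images
scores = detector(x)           # higher score = more likely OOD, per the library's convention
print(scores)
```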
Prerequisites
Learners should have:
- Python programming experience.
- Familiarity with machine learning concepts like train/test splits and cross-validation.
- Exposure to neural networks and basic model training.
Estimated time to complete
This workshop takes approximately 12 hours, divided across multiple hands-on episodes.
Register to take this workshop in Madison!
This workshop is part of UW-Madison’s local Carpentries community efforts to develop advanced ML/AI educational materials. To stay updated about upcoming workshops, subscribe to the Data Science @ UW Newsletter.
Alternatively, work through the materials independently!
The full lesson is available as open-source materials. Visit the lesson materials to explore on your own. If you're at UW-Madison, join the Coding Meetup (Tue/Thu, 2:30-4:30 pm) for assistance.
Questions?
If you have questions about this workshop, visit the Nexus Q&A on GitHub. Feedback is welcome and will help improve future iterations of the workshop.
See also
- Library: AIF360: Learn about fairness metrics and bias mitigation strategies for ML.
- Library: PyTorch-OOD: Tools for detecting out-of-distribution data and anomalies.
- Workshop: Intro to Machine Learning with Sklearn: Once you've mastered Python fundamentals, use the scikit-learn package to explore "classical" ML methods (e.g., regression, clustering, decision trees).
- Workshop: Intro to Deep Learning with PyTorch: Dive into PyTorch as an alternative deep learning framework.
- Talk: Intro to Out-of-Distribution Detection: Learn more about the pervasive problem of out-of-distribution data and the techniques available to mitigate it.
- Talk: Trustworthy LLMs & Ethical AI: Learn how DeTox can be used to remove toxic embeddings in LLMs.