Understanding Quantization and Precision
Explore quantization and floating-point precision in deep learning — covering FP32, FP16, BF16, INT8, and 4-bit formats and their impact on GPU memory and inference speed.
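To make the memory impact concrete, here is a minimal back-of-the-envelope sketch in plain Python. The 7B parameter count is an assumed example, and the estimate covers weights only (it ignores activations, KV cache, and the per-group scales that real quantization schemes store alongside the weights).

```python
# Approximate bytes per parameter for common precisions.
BYTES_PER_PARAM = {
    "FP32": 4.0,   # 32-bit IEEE float: 1 sign, 8 exponent, 23 mantissa bits
    "FP16": 2.0,   # 16-bit IEEE float: 1 sign, 5 exponent, 10 mantissa bits
    "BF16": 2.0,   # bfloat16: 1 sign, 8 exponent, 7 mantissa bits (FP32-like range)
    "INT8": 1.0,   # 8-bit integer quantization
    "INT4": 0.5,   # 4-bit quantization (two weights packed per byte)
}

def weight_memory_gib(num_params: int, fmt: str) -> float:
    """Rough weight-only memory in GiB for a given parameter count and format."""
    return num_params * BYTES_PER_PARAM[fmt] / 1024**3

if __name__ == "__main__":
    n_params = 7_000_000_000  # assumed 7B-parameter model, for illustration only
    for fmt in BYTES_PER_PARAM:
        print(f"{fmt}: ~{weight_memory_gib(n_params, fmt):.1f} GiB")
```

Running this prints roughly 26 GiB for FP32, 13 GiB for FP16/BF16, 6.5 GiB for INT8, and 3.3 GiB for 4-bit weights, which is why lower-precision formats let larger models fit on a single GPU and generally speed up inference.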