Understanding Quantization and Precision
Explore quantization and floating-point precision in deep learning — covering FP32, FP16, BF16, INT8, and 4-bit formats and their impact on GPU memory and inference speed.
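As a quick illustration of why the numeric format matters, the sketch below estimates the raw weight-storage footprint of the same model in each precision. It assumes PyTorch is installed, and the 7B parameter count is a hypothetical example, not a figure from this article.

```python
# Minimal sketch: bytes per parameter for common precisions, and the
# resulting weight-memory footprint of a hypothetical 7B-parameter model.
import torch

num_params = 7_000_000_000  # hypothetical 7B-parameter model

for dtype in (torch.float32, torch.float16, torch.bfloat16, torch.int8):
    # element_size() reports the storage size of one element in bytes
    bytes_per_param = torch.tensor([], dtype=dtype).element_size()
    total_gb = num_params * bytes_per_param / 1024**3
    print(f"{str(dtype):>15}: {bytes_per_param} B/param -> {total_gb:.1f} GB")

# 4-bit formats have no native PyTorch dtype; they pack two weights per
# byte (~0.5 B/param), i.e. roughly 3.3 GB of weights for a 7B model.
```

Halving the bytes per parameter roughly halves the weight memory, which is why FP16/BF16 fit on GPUs where FP32 does not, and why INT8 and 4-bit formats shrink the footprint further still.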