Hugging Face
Hugging Face is an open-source platform and ecosystem for machine learning. It’s best known for the Transformers library, but it has grown into a one-stop shop for finding, sharing, and running models, datasets, and demos across NLP, computer vision, audio, and multimodal tasks. If you’re doing ML research or building an ML application, chances are you’ll interact with Hugging Face at some point.
The platform hosts over 900,000 models and 200,000 datasets contributed by the community, along with tools for training, fine-tuning, evaluation, and deployment. It integrates natively with PyTorch, TensorFlow, and JAX.
Key features
- Model Hub: Browse, search, and download pretrained models for virtually any task — text generation, classification, object detection, speech recognition, and more. Every model has a model card with documentation, usage examples, and performance details.
- Datasets library: Load thousands of community-contributed datasets with a single line of code using the `datasets` library. Supports streaming for large datasets that don't fit in memory.
- Pipelines: The `pipeline()` API provides a high-level interface for common tasks (sentiment analysis, summarization, image classification, etc.) — often just 2-3 lines of code to go from zero to inference.
- Spaces: Host interactive ML demos using Gradio or Streamlit directly on Hugging Face. Great for sharing prototypes, class projects, or research demos without managing infrastructure.
- Fine-tuning and training: The `Trainer` API and integrations with PEFT (parameter-efficient fine-tuning) and TRL (a library for RLHF and other preference-tuning methods) make it straightforward to adapt models to your data.
Finding and downloading models
The Model Hub is the main entry point. You can filter by task, library, language, license, and more. Each model page includes:
- A model card describing the architecture, training data, intended use, and limitations
- Usage snippets you can copy directly into your code
- Community discussion and version history
To download and use a model in Python:

```python
from transformers import pipeline

# High-level: use a pipeline for common tasks
classifier = pipeline("sentiment-analysis")
result = classifier("Hugging Face makes ML accessible!")

# Or load a specific model and tokenizer
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
```

For non-Transformers models (e.g., scikit-learn, spaCy, or custom frameworks), the `huggingface_hub` library provides download/upload utilities:
```python
from huggingface_hub import hf_hub_download

# Download a specific file from any repo
path = hf_hub_download(repo_id="username/model-name", filename="model.pkl")
```

Useful functions and integrations
| Function / Tool | What it does |
|---|---|
| `pipeline()` | One-liner inference for 30+ task types |
| `AutoModel` / `AutoTokenizer` | Auto-detect and load the right model class |
| `datasets.load_dataset()` | Load any Hub dataset with streaming support |
| `Trainer` | Full training loop with logging, evaluation, checkpointing |
| PEFT (LoRA, QLoRA) | Fine-tune large models on limited hardware |
| `accelerate` | Distribute training across GPUs with minimal code changes |
| `huggingface_hub` | Upload/download models, datasets, and files programmatically |
| `evaluate` | Standardized metrics (accuracy, BLEU, ROUGE, etc.) |
Hugging Face also integrates with:
- PyTorch, TensorFlow, JAX — Transformers models work across all three
- LangChain — use HF models as components in LLM chains and RAG pipelines
- Sentence Transformers — specialized library for text embeddings, built on HF
- GGUF / llama.cpp — quantized model formats hosted on the Hub for local inference
- Weights & Biases, MLflow — experiment tracking integrations
Model hosting and inference
Hugging Face offers several options for running models beyond your local machine:
- Inference API: Send HTTP requests to hosted models for quick prototyping — no GPU required on your end. Free tier available for many models.
- Inference Endpoints: Deploy dedicated, scalable endpoints for production workloads. You choose the hardware (CPU, GPU, etc.) and pay for what you use.
- Spaces: Host interactive demos powered by Gradio or Streamlit. Useful for sharing results with collaborators or embedding demos in papers and presentations.
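A minimal sketch of calling the hosted Inference API via `huggingface_hub`'s `InferenceClient`; the model name is an illustrative choice, and depending on the model you may need a (free) account token:

```python
from huggingface_hub import InferenceClient

# A token can be passed explicitly or picked up from `huggingface-cli login`
client = InferenceClient()

# Run sentiment classification on a hosted model — no local GPU required
result = client.text_classification(
    "Hugging Face makes ML accessible!",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(result)
```

This is handy for prototyping: the same code later points at a dedicated Inference Endpoint by changing the client's target.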
UW-Madison on Hugging Face
UW-Madison has a verified organization on Hugging Face: huggingface.co/uw-madison. It hosts models developed by UW researchers, including efficient transformer architectures like YOSO and Nystromformer. If your lab is publishing models, consider hosting them under this org for visibility and discoverability. There are also department-level orgs like UW-Madison-Radiology.
Installation
```bash
# Core library
pip install transformers

# Common companion libraries
pip install datasets evaluate accelerate huggingface_hub

# For parameter-efficient fine-tuning
pip install peft trl
```

Questions?
If you have any lingering questions about this resource, please feel free to post to the Nexus Q&A on GitHub. We will improve materials on this website as additional questions come in.