Hugging Face

Libraries
Hugging Face
NLP
Deep learning
Foundation models
LLM
Computer vision
Model exploration
Model sharing
PyTorch
TensorFlow
Multimodal learning
Author

Chris Endemann

Published

August 1, 2024

Hugging Face is an open-source platform and ecosystem for machine learning. It’s best known for the Transformers library, but it has grown into a one-stop shop for finding, sharing, and running models, datasets, and demos across NLP, computer vision, audio, and multimodal tasks. If you’re doing ML research or building an ML application, chances are you’ll interact with Hugging Face at some point.

The platform hosts over 900,000 models and 200,000 datasets contributed by the community, along with tools for training, fine-tuning, evaluation, and deployment. It integrates natively with PyTorch, TensorFlow, and JAX.

Key features

  • Model Hub: Browse, search, and download pretrained models for virtually any task — text generation, classification, object detection, speech recognition, and more. Every model has a model card with documentation, usage examples, and performance details.
  • Datasets library: Load thousands of community-contributed datasets with a single line of code using the datasets library. Supports streaming for large datasets that don’t fit in memory.
  • Pipelines: The pipeline() API provides a high-level interface for common tasks (sentiment analysis, summarization, image classification, etc.) — often just 2-3 lines of code to go from zero to inference.
  • Spaces: Host interactive ML demos using Gradio or Streamlit directly on Hugging Face. Great for sharing prototypes, class projects, or research demos without managing infrastructure.
  • Fine-tuning and training: The Trainer API, together with PEFT (parameter-efficient fine-tuning) and TRL (Transformer Reinforcement Learning, which supports RLHF-style training), makes it straightforward to adapt models to your data.

Finding and downloading models

The Model Hub is the main entry point. You can filter by task, library, language, license, and more. Each model page includes:

  • A model card describing the architecture, training data, intended use, and limitations
  • Usage snippets you can copy directly into your code
  • Community discussion and version history

To download and use a model in Python:

from transformers import pipeline

# High-level: use a pipeline for common tasks
# (with no model specified, a default model is downloaded on first use)
classifier = pipeline("sentiment-analysis")
result = classifier("Hugging Face makes ML accessible!")

# Or load a specific model and tokenizer
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
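
Once loaded, the tokenizer and model compose into a forward pass. A minimal sketch, assuming PyTorch is installed (the input sentence is arbitrary):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Tokenize a sentence and run it through the model
inputs = tokenizer("Hugging Face makes ML accessible!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# BERT-base produces a 768-dimensional vector per token
print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
```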

For non-Transformers models (e.g., scikit-learn, spaCy, or custom frameworks), the huggingface_hub library provides download/upload utilities:

from huggingface_hub import hf_hub_download

# Download a specific file from any repo
path = hf_hub_download(repo_id="username/model-name", filename="model.pkl")
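
The Hub can also be searched programmatically. A small sketch using the HfApi client (this assumes a recent huggingface_hub version; no authentication token is required for public listings):

```python
from huggingface_hub import HfApi

api = HfApi()

# List the three most-downloaded text-classification models on the Hub
models = list(api.list_models(task="text-classification",
                              sort="downloads", limit=3))
for m in models:
    print(m.id)
```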

Useful functions and integrations

  • pipeline(): one-liner inference for 30+ task types
  • AutoModel / AutoTokenizer: auto-detect and load the right model class for a checkpoint
  • datasets.load_dataset(): load any Hub dataset, with streaming support
  • Trainer: full training loop with logging, evaluation, and checkpointing
  • PEFT (LoRA, QLoRA): fine-tune large models on limited hardware
  • accelerate: distribute training across GPUs with minimal code changes
  • huggingface_hub: upload/download models, datasets, and files programmatically
  • evaluate: standardized metrics (accuracy, BLEU, ROUGE, etc.)

Hugging Face also integrates with:

  • PyTorch, TensorFlow, JAX — Transformers models work across all three
  • LangChain — use HF models as components in LLM chains and RAG pipelines
  • Sentence Transformers — specialized library for text embeddings, built on HF
  • GGUF / llama.cpp — quantized model formats hosted on the Hub for local inference
  • Weights & Biases, MLflow — experiment tracking integrations

Model hosting and inference

Hugging Face offers several options for running models beyond your local machine:

  • Inference API: Send HTTP requests to hosted models for quick prototyping — no GPU required on your end. Free tier available for many models.
  • Inference Endpoints: Deploy dedicated, scalable endpoints for production workloads. You choose the hardware (CPU, GPU, etc.) and pay for what you use.
  • Spaces: Host interactive demos powered by Gradio or Streamlit. Useful for sharing results with collaborators or embedding demos in papers and presentations.
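
For the Inference API, a minimal sketch using the InferenceClient from huggingface_hub. Anonymous requests work for many public models but are rate limited; the model ID below is a popular public sentiment model, chosen here purely as an example:

```python
from huggingface_hub import InferenceClient

# Anonymous client; pass token="hf_..." for higher rate limits
client = InferenceClient()
preds = client.text_classification(
    "Hugging Face makes ML accessible!",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(preds[0].label, round(preds[0].score, 3))
```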

UW-Madison on Hugging Face

UW-Madison has a verified organization on Hugging Face: huggingface.co/uw-madison. It hosts models developed by UW researchers, including efficient transformer architectures like YOSO and Nystromformer. If your lab is publishing models, consider hosting them under this org for visibility and discoverability. There are also department-level orgs like UW-Madison-Radiology.

Installation

# Core library
pip install transformers

# Common companion libraries
pip install datasets evaluate accelerate huggingface_hub

# For parameter-efficient fine-tuning
pip install peft trl
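
A quick way to confirm the install worked is to print the library version:

```shell
python -c "import transformers; print(transformers.__version__)"
```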

Questions?

If you have any lingering questions about this resource, please feel free to post to the Nexus Q&A on GitHub. We will improve materials on this website as additional questions come in.