Hugging Face

Libraries
Hugging Face
NLP
Deep learning
Foundation models
LLM
Computer vision
Model exploration
Model sharing
PyTorch
TensorFlow
Multimodal learning
Author

Chris Endemann

Published

August 1, 2024

Hugging Face is an open-source platform and ecosystem for machine learning. It’s best known for the Transformers library, but it has grown into a one-stop shop for finding, sharing, and running models, datasets, and demos across NLP, computer vision, audio, and multimodal tasks. If you’re doing ML research or building an ML application, chances are you’ll interact with Hugging Face at some point.

The platform hosts over 900,000 models and 200,000 datasets contributed by the community, along with tools for training, fine-tuning, evaluation, and deployment. It integrates natively with PyTorch, TensorFlow, and JAX.

Key features

  • Model Hub: Browse, search, and download pretrained models for virtually any task — text generation, classification, object detection, speech recognition, and more. Every model has a model card with documentation, usage examples, and performance details.
  • Datasets library: Load thousands of community-contributed datasets with a single line of code using the datasets library. Supports streaming for large datasets that don’t fit in memory.
  • Pipelines: The pipeline() API provides a high-level interface for common tasks (sentiment analysis, summarization, image classification, etc.) — often just 2-3 lines of code to go from zero to inference.
  • Spaces: Host interactive ML demos using Gradio or Streamlit directly on Hugging Face. Great for sharing prototypes, class projects, or research demos without managing infrastructure.
  • Fine-tuning and training: The Trainer API, together with PEFT (parameter-efficient fine-tuning) and TRL (Transformer Reinforcement Learning, which supports RLHF-style training), makes it straightforward to adapt models to your data.

Finding and downloading models

The Model Hub is the main entry point. You can filter by task, library, language, license, and more. Each model page includes:

  • A model card describing the architecture, training data, intended use, and limitations
  • Usage snippets you can copy directly into your code
  • Community discussion and version history

To download and use a model in Python:

from transformers import pipeline

# High-level: use a pipeline for common tasks
# (with no model specified, a default model is downloaded on first use)
classifier = pipeline("sentiment-analysis")
result = classifier("Hugging Face makes ML accessible!")

# Or load a specific model and tokenizer
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
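
Once loaded, the tokenizer and model compose into a forward pass. A minimal sketch, assuming PyTorch is installed (the input sentence is arbitrary):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Tokenize a sentence and run it through the model
inputs = tokenizer("Hugging Face makes ML accessible!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# BERT-base produces a 768-dimensional vector per token
print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
```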

For non-Transformers models (e.g., scikit-learn, spaCy, or custom frameworks), the huggingface_hub library provides download/upload utilities:

from huggingface_hub import hf_hub_download

# Download a specific file from any repo
path = hf_hub_download(repo_id="username/model-name", filename="model.pkl")
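
The Hub can also be searched programmatically. A small sketch using the HfApi client (this assumes a recent huggingface_hub version; no authentication token is required for public listings):

```python
from huggingface_hub import HfApi

api = HfApi()

# List the three most-downloaded text-classification models on the Hub
models = list(api.list_models(task="text-classification",
                              sort="downloads", limit=3))
for m in models:
    print(m.id)
```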

Useful functions and integrations

  • pipeline(): one-liner inference for 30+ task types
  • AutoModel / AutoTokenizer: auto-detect and load the right model class for a checkpoint
  • datasets.load_dataset(): load any Hub dataset, with streaming support
  • Trainer: full training loop with logging, evaluation, and checkpointing
  • PEFT (LoRA, QLoRA): fine-tune large models on limited hardware
  • accelerate: distribute training across GPUs with minimal code changes
  • huggingface_hub: upload/download models, datasets, and files programmatically
  • evaluate: standardized metrics (accuracy, BLEU, ROUGE, etc.)

Hugging Face also integrates with:

  • PyTorch, TensorFlow, JAX — Transformers models work across all three
  • LangChain — use HF models as components in LLM chains and RAG pipelines
  • Sentence Transformers — specialized library for text embeddings, built on HF
  • GGUF / llama.cpp — quantized model formats hosted on the Hub for local inference
  • Weights & Biases, MLflow — experiment tracking integrations

Model hosting and inference

Hugging Face offers several options for running models beyond your local machine:

  • Inference API: Send HTTP requests to hosted models for quick prototyping — no GPU required on your end. Free tier available for many models.
  • Inference Endpoints: Deploy dedicated, scalable endpoints for production workloads. You choose the hardware (CPU, GPU, etc.) and pay for what you use.
  • Spaces: Host interactive demos powered by Gradio or Streamlit. Useful for sharing results with collaborators or embedding demos in papers and presentations.
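
For the Inference API, a minimal sketch using the InferenceClient from huggingface_hub. Anonymous requests work for many public models but are rate limited; the model ID below is a popular public sentiment model, chosen here purely as an example:

```python
from huggingface_hub import InferenceClient

# Anonymous client; pass token="hf_..." for higher rate limits
client = InferenceClient()
preds = client.text_classification(
    "Hugging Face makes ML accessible!",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(preds[0].label, round(preds[0].score, 3))
```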

UW-Madison on Hugging Face

UW-Madison has a verified organization on Hugging Face: huggingface.co/uw-madison. It hosts models developed by UW researchers, including efficient transformer architectures like YOSO and Nystromformer. If your lab is publishing models, consider hosting them under this org for visibility and discoverability. There are also department-level orgs like UW-Madison-Radiology.

Installation

# Core library
pip install transformers

# Common companion libraries
pip install datasets evaluate accelerate huggingface_hub

# For parameter-efficient fine-tuning
pip install peft trl
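
A quick way to confirm the install worked is to print the library version:

```shell
python -c "import transformers; print(transformers.__version__)"
```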

Questions?

If you have any lingering questions about this resource, please feel free to post to the Nexus Q&A on GitHub. We will improve materials on this website as additional questions come in.