| Active learning |
A machine learning approach where the model selectively queries an oracle (e.g., a human annotator) for labels on the most informative samples, reducing the amount of labeled data needed for training. |
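The query step can be sketched with uncertainty sampling, one common selection strategy. The probabilities below are illustrative toy values, not output from a real model:

```python
import numpy as np

# Unlabeled pool: predicted P(class=1) from the current model (toy values).
probs = np.array([0.98, 0.52, 0.10, 0.47, 0.85])

# Uncertainty sampling: query the sample whose prediction is least confident,
# i.e. closest to 0.5 for a binary classifier.
uncertainty = 1.0 - np.abs(probs - 0.5) * 2  # 1 = maximally uncertain
query_order = np.argsort(-uncertainty)       # most informative first
```

Here the samples with probabilities 0.52 and 0.47 are queried first; the confident 0.98 prediction is queried last.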
| Agriculture |
The application of machine learning in agricultural practices, including crop prediction, pest detection, and yield optimization. Often involves satellite imaging and IoT data. |
| AI & society |
The intersection of artificial intelligence with societal impacts, ethics, policy, and governance. Covers topics like AI regulation, workforce displacement, and responsible innovation. |
| AIF360 (AI Fairness 360) |
An open-source toolkit from IBM for detecting and mitigating bias in machine learning models. Provides metrics for checking fairness and algorithms for bias mitigation across the ML pipeline. |
| AlphaFold |
DeepMind’s AI system for predicting protein 3D structures from amino acid sequences. A breakthrough in computational biology that has significantly advanced structural biology research. |
| Anomaly detection |
Techniques for identifying unusual patterns, outliers, or rare events in data that do not conform to expected behavior. Applications include fraud detection, network security, and quality control. See also: OOD detection. |
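A minimal illustration using a robust z-score (median/MAD) on made-up sensor readings — one of many possible detectors:

```python
import numpy as np

readings = np.array([10.1, 9.8, 10.3, 9.9, 10.0, 25.0, 10.2])

# Flag points more than 3 robust z-scores from the median; median/MAD are
# far less distorted by the outlier itself than mean/std would be.
median = np.median(readings)
mad = np.median(np.abs(readings - median))
robust_z = 0.6745 * (readings - median) / mad  # 0.6745 rescales MAD to std units
outliers = np.where(np.abs(robust_z) > 3)[0]
```

The reading of 25.0 at index 5 is flagged; every other point stays within the threshold.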
| Athletics |
The application of machine learning in sports analytics, including performance tracking, injury prediction, and game strategy optimization. |
| Audio |
Resources related to audio data processing and analysis using machine learning techniques. |
| Audio data |
Data in audio format (e.g., speech, music, environmental sounds) used as input for machine learning tasks such as classification, transcription, and generation. |
| Audio search |
Using machine learning to search, identify, and retrieve audio content based on acoustic features, metadata, or content-based queries. |
| Autoencoder |
A type of neural network that learns to compress data into a lower-dimensional representation and then reconstruct it. Used for dimensionality reduction, denoising, and generative modeling. |
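The compress-and-reconstruct idea can be sketched without a deep learning framework: the optimal *linear* autoencoder coincides with PCA, so an SVD gives the bottleneck directly (synthetic data, NumPy only):

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 points that lie close to a 2-D plane inside 5-D space
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))
X = X - X.mean(axis=0)

# Encode with the top-2 right singular vectors, decode with their transpose.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
encode = Vt[:2].T            # 5 -> 2 bottleneck
code = X @ encode
X_hat = code @ encode.T      # reconstruction

rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

Because the data is nearly rank-2, the 2-D code reconstructs it almost perfectly; nonlinear autoencoders extend this idea with deep encoder/decoder networks.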
| AWS (Amazon Web Services) |
Amazon’s cloud computing platform offering a wide range of services for ML, including SageMaker for model training and deployment, and Bedrock for foundation model access. |
| Azure |
Microsoft’s cloud computing platform providing ML services such as Azure Machine Learning, Cognitive Services, and OpenAI integrations for building and deploying AI solutions. |
| BadgerCompute |
A UW-Madison computing resource providing access to computational infrastructure for research, including machine learning workloads. |
| Bedrock |
Amazon Bedrock, an AWS service that provides access to foundation models from leading AI companies through a unified API for building generative AI applications. |
| Benchmarking |
The practice of evaluating and comparing ML model performance using standardized datasets, metrics, and protocols. Essential for tracking progress and making informed model selections. |
| Bias |
Systematic errors or unfairness in ML models that can arise from training data, algorithm design, or evaluation methods. Addressing bias is critical for building equitable AI systems. |
| Biology |
The application of machine learning in biological sciences, including areas like genomics, ecology, protein modeling, and drug discovery. |
| Biophysics |
The study of biological processes through the methods of physics. Machine learning is increasingly applied in areas like protein structure prediction and molecular dynamics simulations. |
| Blogs |
Blog-format resources providing accessible explanations, tutorials, and commentary on machine learning topics. |
| Books |
Comprehensive resources for learning and reference. Often written by experts, these provide in-depth coverage of machine learning topics. |
| Boosting |
An ensemble machine learning technique that combines multiple weak learners sequentially, where each new model focuses on correcting errors made by the previous ones. Includes algorithms like AdaBoost, Gradient Boosting, and XGBoost. |
| Business |
The application of machine learning in business contexts, including customer analytics, demand forecasting, and process automation. |
| Camera trap |
The application of machine learning to classify and analyze images from remote wildlife cameras, aiding ecological research and conservation monitoring. |
| Carpentries |
The Carpentries, a community that teaches foundational coding and data skills through hands-on, interactive workshops. Facilitates skill acquisition through live coding, real-world examples, and active learning.
| CHTC (Center for High Throughput Computing) |
A UW-Madison research center providing large-scale distributed computing resources for research, including ML workloads that require high-throughput processing. |
| CIFAR (Canadian Institute for Advanced Research) |
Commonly refers to the CIFAR-10 and CIFAR-100 benchmark image classification datasets, widely used for evaluating computer vision models. |
| Citizen science |
Public participation in scientific research, often aided by ML for tasks like data collection, labeling, and analysis at scale. |
| Classical ML (Classical Machine Learning) |
Refers to traditional ML algorithms like SVMs, decision trees, and k-means clustering. These methods are well-established, interpretable, and often less resource-intensive than deep learning. |
| CLIP (Contrastive Language-Image Pre-training) |
An OpenAI model that learns visual concepts from natural language supervision, connecting text and images in a shared embedding space. Enables zero-shot image classification and cross-modal retrieval. |
| Cloud |
Cloud computing platforms and services (e.g., AWS, GCP, Azure) used for ML model training, deployment, and scaling. Provides on-demand access to GPUs and other resources. |
| Clustering |
The process of grouping a set of objects in such a way that objects in the same group (cluster) are more similar to each other than to those in other groups. Common algorithms include k-means and DBSCAN. |
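A minimal NumPy sketch of Lloyd's k-means on two synthetic blobs. For a deterministic demo it seeds one center per blob; real implementations use smarter initialization such as k-means++:

```python
import numpy as np

rng = np.random.default_rng(42)
# Two well-separated 2-D blobs of 50 points each
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(5.0, 0.3, size=(50, 2))])

centers = X[[0, 50]].copy()  # one seed point from each blob
for _ in range(10):
    # assign each point to its nearest center
    labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
    # move each center to the mean of its assigned points
    centers = np.vstack([X[labels == j].mean(axis=0) for j in range(2)])
```

After convergence the two clusters coincide with the two blobs and the centers sit near (0, 0) and (5, 5).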
| CNN (Convolutional Neural Network) |
A class of deep learning models particularly well-suited for image processing tasks. Known for their ability to automatically learn spatial hierarchies of features. Common libraries include TensorFlow and Keras. |
| CNN-LSTM (Convolutional Neural Network - Long Short-Term Memory) |
A hybrid model combining CNNs for feature extraction from images or sequences and LSTMs for handling time dependencies. Used in applications like video classification. |
| Code-along |
Live coding sessions where learners write code simultaneously with the instructor, which enhances learning through practice and immediate application of concepts. |
| Colab |
Google Colaboratory, a free cloud-based Jupyter notebook environment that provides access to GPUs and TPUs. Widely used for ML experimentation and education. |
| ComBEE (Computational Biology, Ecology, and Evolution) |
A UW-Madison community group focused on computational approaches in biology, ecology, and evolution, often leveraging machine learning methods. |
| Compute |
Refers to computational resources, often in the context of machine learning, such as CPUs, GPUs, and cloud computing. Critical for training models, especially deep learning models. |
| Computer vision |
A field of machine learning focused on enabling computers to interpret and make decisions based on visual data. Applications include image recognition, object detection, and facial recognition. Libraries include OpenCV, PyTorch, and TensorFlow. |
| Conformer |
A convolution-augmented Transformer architecture that combines self-attention with convolutional layers, commonly used in speech and audio processing tasks. |
| Conservation |
The application of machine learning in wildlife and environmental conservation, including species identification, habitat monitoring, and biodiversity assessment. |
| Contrastive learning |
A self-supervised learning approach that trains models by contrasting positive pairs (similar samples) against negative pairs (dissimilar samples) to learn meaningful representations without labeled data. |
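The core objective can be illustrated with a NumPy version of the InfoNCE loss on toy embeddings, where each sample's positive is a slightly perturbed copy of itself (the temperature of 0.1 is an arbitrary choice here):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
# Embeddings of two augmented "views" of the same 8 samples.
z1 = rng.normal(size=(8, 16))
z2 = z1 + 0.05 * rng.normal(size=(8, 16))   # positives: slightly perturbed copies

# L2-normalize, then compute the similarity matrix; row i's positive is column i.
z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
sim = z1 @ z2.T / 0.1                        # temperature tau = 0.1

# InfoNCE loss: cross-entropy where the matching pair is the correct "class".
probs = softmax(sim, axis=1)
loss = -np.log(np.diag(probs)).mean()
```

Because positives are near-duplicates, the diagonal dominates and the loss is already small; in training, the encoder is optimized to make this hold for genuinely different augmentations.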
| Contribute |
Resources and ways to contribute to open-source projects, fostering collaboration, improving software quality, and providing community support.
| Cross Labs AI |
Refers to interdisciplinary AI research and development across various labs or research groups, often involving collaboration between different fields such as computer science, biology, and engineering. |
| CSI (Cover Song Identification) |
A task in music information retrieval that involves identifying whether one song is a cover version of another. Machine learning models for CSI often analyze harmonic, melodic, and rhythmic similarities across versions. |
| Data |
General resources and concepts related to data collection, management, preprocessing, and curation for machine learning projects. |
| Decision trees |
A tree-structured ML model that makes predictions by learning decision rules from data features. Interpretable and widely used for both classification and regression tasks. Forms the basis for ensemble methods like Random Forests and Gradient Boosting. |
| Deep learning |
A subset of machine learning involving neural networks with many layers, used for complex tasks like image recognition and natural language processing. Commonly used libraries include TensorFlow, PyTorch, and Keras. |
| DeTox |
A UW-Madison research initiative focused on detecting and mitigating toxicity in AI-generated content and online communications. |
| Diffusion |
A class of generative models that learn to create data by reversing a gradual noising process. Used for high-quality image, audio, and video generation. Examples include Stable Diffusion and DALL-E.
| Drug synergy |
The study and identification of drug combinations that produce a greater effect together than individually. Machine learning is used to predict synergistic drug pairs. |
| Early-fusion |
A multimodal learning technique that combines data from different modalities (e.g., text and images) at the input level before processing, as opposed to late-fusion which combines after separate processing. |
| Ecology |
The application of machine learning in ecological research, including species distribution modeling, ecosystem monitoring, and biodiversity analysis. |
| EDA (Exploratory Data Analysis) |
The process of analyzing and visualizing datasets to summarize their main characteristics, discover patterns, and inform subsequent modeling decisions. A critical first step in any ML project. |
| Education |
Resources related to teaching and learning about machine learning, including curricula, pedagogical approaches, and educational tools. |
| EHR (Electronic Health Records) |
The application of machine learning to electronic health record data for tasks such as clinical prediction, patient risk stratification, and treatment recommendation. |
| Empirical patterns |
Observed patterns in data identified through experimentation and analysis. These patterns help inform the development of models and algorithms. |
| Energy |
The application of machine learning in the energy sector, including energy consumption forecasting, grid optimization, and renewable energy management. |
| Ethical AI |
Practices and principles for developing AI systems that are fair, transparent, accountable, and aligned with human values. Encompasses bias mitigation, privacy, and responsible deployment. |
| Explainability |
Methods and techniques for making ML model decisions understandable to humans, such as feature importance, attention visualization, and SHAP values. Related to interpretability and trustworthy AI. |
| Exploring AI@UW |
A series exploring AI research, applications, and initiatives at the University of Wisconsin-Madison, showcasing interdisciplinary work across campus. |
| Fairness |
Ensuring ML models treat different demographic groups equitably. Involves metrics for measuring disparate impact and algorithms for mitigating unfair outcomes. |
| First-steps |
Introductory resources designed for beginners getting started with machine learning concepts, tools, and workflows. |
| Forest monitoring |
Using machine learning for monitoring and analyzing forest ecosystems, including deforestation detection, tree species classification, and forest health assessment. |
| Forums |
Online platforms where members of the machine learning community can ask questions, share knowledge, and collaborate on projects. Examples include Reddit, Stack Overflow, and specialized forums like Kaggle. |
| Foundation models |
Large-scale pre-trained models that serve as a base for fine-tuning on specific tasks. They underpin many state-of-the-art NLP and computer vision systems. Examples include GPT-3, BERT, and CLIP. |
| GCP (Google Cloud Platform) |
Google’s cloud computing platform offering ML services such as Vertex AI, TPU access, and AutoML for building and deploying machine learning models. |
| Gemini |
Google’s family of multimodal AI models capable of processing and generating text, images, audio, and code. Successor to earlier models like PaLM. |
| GenAI (Generative AI) |
AI models that can create new content including text, images, audio, and code. Encompasses technologies like large language models, diffusion models, and GANs. |
| Genomics |
The study of genomes, often involving the analysis of DNA sequences. Machine learning aids in tasks like gene prediction, mutation analysis, and personalized medicine. |
| Geospatial data |
Geographic and spatial data used in ML applications such as remote sensing, environmental monitoring, and location-based analytics. |
| Git/GitHub |
Version control system (Git) and the associated platform (GitHub) for hosting and sharing code. Essential tools for collaboration and project management in software development, including ML projects. |
| GPU (Graphics Processing Unit) |
A specialized processor widely used in deep learning for its parallel processing capabilities. Essential for training large neural networks efficiently. |
| GradCAM (Gradient-weighted Class Activation Mapping) |
A visualization technique that produces heatmaps highlighting the important regions in an image for a CNN’s prediction, aiding model interpretability. |
| Grokking |
A phenomenon in machine learning where a model's generalization improves suddenly long after it has already fit the training data. Highlights the non-linear relationship between training time and model performance.
| Guides |
Detailed instructions or explanations, often in the form of tutorials or documentation, aimed at helping users understand and apply specific concepts or tools. |
| Healthcare |
The application of machine learning in the healthcare industry, including areas like medical imaging, diagnostics, and personalized treatment plans. Common challenges include data privacy and interpretability. |
| Hugging Face |
A popular platform for sharing pre-trained models, datasets, and other machine learning resources, especially in NLP. Provides tools like the Transformers library for easy model deployment. |
| Humanities |
The application of machine learning in humanities research, including digital humanities, text analysis of historical documents, and computational approaches to cultural studies. |
| ICCV (International Conference on Computer Vision) |
A top-tier academic conference in computer vision, featuring cutting-edge research on image recognition, 3D reconstruction, and visual understanding. |
| Image classification |
The ML task of assigning a label or category to an entire image based on its visual content. A foundational task in computer vision with applications across many domains. |
| Image data |
Data in image format (e.g., photographs, medical scans, satellite imagery) used as input for computer vision and other ML tasks. |
| Image preprocessing |
Techniques for preparing image data before feeding it to ML models, including resizing, normalization, augmentation, and noise reduction. |
| Image processing |
Computational techniques for manipulating and analyzing images, including filtering, enhancement, and transformation operations that often precede or complement ML pipelines. |
| Image segmentation |
The ML task of partitioning an image into meaningful regions or segments, assigning a label to each pixel. Used in medical imaging, autonomous driving, and satellite image analysis. |
| Industry applications |
Refers to the use of machine learning across various industries such as finance, manufacturing, and logistics, for tasks like predictive maintenance, fraud detection, and supply chain optimization. |
| Interpretability |
The degree to which humans can understand the behavior, predictions, and decision-making process of an ML model. Critical for trust, debugging, and regulatory compliance. |
| IT Prof (IT Professional) |
Resources and presentations targeted at IT professionals working with machine learning infrastructure, deployment, and operations. |
| Jupyter |
An open-source interactive computing environment for creating notebooks that combine live code, equations, visualizations, and narrative text. Widely used in data science and ML workflows. |
| Keras |
A high-level neural networks API written in Python that allows for easy and fast prototyping of deep learning models. Originally ran on top of TensorFlow, Theano, or CNTK; Keras 3 supports TensorFlow, JAX, and PyTorch backends.
| Knowledge-based |
Refers to models or systems that incorporate domain-specific knowledge, often encoded in rules or ontologies, to improve decision-making or interpretability. Examples include expert systems and knowledge graphs. |
| Label-efficient learning |
Machine learning approaches designed to achieve good performance with fewer labeled examples, encompassing techniques like semi-supervised learning, active learning, and few-shot learning. |
| Libraries |
Software libraries and toolkits that provide pre-built functions and tools for machine learning development, such as PyTorch, TensorFlow, Scikit-learn, and Hugging Face Transformers. |
| LLaVA (Large Language and Vision Assistant) |
A multimodal AI model that combines language and vision understanding, capable of processing and generating both text and images. |
| LLM (Large Language Model) |
A type of deep learning model that can process and generate human-like text by understanding context from vast amounts of data. Examples include the GPT series and LLaMA.
| LMM (Large Multimodal Model) |
A class of models that can process and generate content across different modalities such as text, image, and audio. Examples include Gemini and LLaVA.
| LoFTR (Local Feature Transformer) |
A deep learning model for establishing pixel-level correspondences between images using transformers, effective for tasks like image stitching, 3D reconstruction, and visual localization. |
| LSTM (Long Short-Term Memory) |
A type of recurrent neural network (RNN) designed to learn long-term dependencies in sequential data. Widely used in time-series prediction, speech recognition, and text generation. |
| Medical imaging |
The application of machine learning techniques to analyze and interpret medical images. Common tasks include segmentation, classification, and anomaly detection. Libraries include MONAI, PyTorch, and TensorFlow. |
| Meteorology |
The application of machine learning in weather and climate science, including weather forecasting, climate modeling, and extreme event prediction. |
| Microsoft Copilot |
Microsoft’s AI assistant powered by large language models, integrated across Microsoft products for tasks like code completion, document drafting, and data analysis. |
| ML+X (Machine Learning + X) |
The concept of combining machine learning with domain expertise across disciplines. The core focus of the ML+X Nexus community at UW-Madison. |
| ML4MI (Machine Learning for Medical Imaging) |
A UW-Madison initiative focused on applying machine learning methods to medical imaging problems, fostering collaboration between computer scientists and medical researchers. |
| MLflow |
An open-source platform for managing the end-to-end machine learning lifecycle, including experiment tracking, model packaging, and deployment. |
| MLOps (Machine Learning Operations) |
Practices and tools for deploying, monitoring, and maintaining ML models in production environments. Bridges the gap between ML development and reliable system operation. |
| MLOPT |
A UW-Madison research group focused on machine learning and optimization, exploring the intersection of optimization theory and practical ML applications. |
| Model exploration |
Libraries that facilitate trying out different model architectures and pretrained models. |
| Model sharing |
Practices and platforms for sharing trained ML models with the community, enabling reproducibility and reuse. Platforms include Hugging Face Hub and PyTorch Hub. |
| Models |
Resources providing access to pre-trained or reference ML model implementations for various tasks and domains. |
| Multilingual |
Machine learning approaches and models that work across multiple languages, including multilingual NLP, translation, and cross-lingual transfer learning. |
| Multimodal data |
Data spanning multiple modalities (e.g., text, image, audio, video) used together to provide richer information for ML models. |
| Multimodal learning |
Machine learning approaches that process and integrate information from multiple data modalities (e.g., text and images, audio and video) to improve understanding and predictions. |
| Music |
The application of machine learning in music analysis, generation, classification, and information retrieval. |
| Music transcription |
Using machine learning to automatically convert audio music recordings into symbolic notation (e.g., sheet music or MIDI), including pitch detection and rhythm analysis. |
| NLP (Natural Language Processing) |
A field of machine learning focused on the interaction between computers and humans using natural language. Common libraries include NLTK, SpaCy, and Hugging Face Transformers. |
| Notebooks |
Jupyter or Quarto notebook-format resources that combine executable code, visualizations, and narrative text for interactive learning and reproducible analysis. |
| Novelty detection |
Techniques for identifying data points that are significantly different from the training distribution, related to but distinct from anomaly detection and OOD detection. |
| Object detection |
The ML task of identifying and localizing objects within images by drawing bounding boxes and assigning class labels. Common models include YOLO, Faster R-CNN, and DETR. |
| OCR (Optical Character Recognition) |
The use of machine learning to convert images of text (handwritten, printed, or scanned) into machine-readable text. Applications include document digitization and text extraction. |
| OOD detection (Out-of-Distribution Detection) |
Techniques for identifying data points that do not belong to the distribution on which a model was trained. Important for building robust and trustworthy models. Common methods include Mahalanobis distance and energy-based models. |
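A minimal sketch of the Mahalanobis-distance score mentioned above, using synthetic 2-D features (in practice the threshold is tuned on held-out data):

```python
import numpy as np

rng = np.random.default_rng(1)
# In-distribution training features: correlated 2-D Gaussian
train = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=500)

# Fit the score: distance from the training mean, whitened by the covariance.
mean = train.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(train, rowvar=False))

def mahalanobis(x):
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

in_dist_score = mahalanobis(np.array([0.1, 0.2]))   # near the training data
ood_score = mahalanobis(np.array([4.0, -4.0]))      # violates the correlation
```

The point (4, -4) runs against the learned correlation structure, so its score is an order of magnitude larger than the in-distribution point's, even though both coordinates individually are plausible magnitudes.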
| Open-set recognition |
A classification paradigm where the model must not only classify known categories but also identify and reject samples from unknown classes not seen during training. See also: OOD detection. |
| Perception |
ML systems that interpret sensory data such as vision, audio, and touch to understand and interact with the physical world. Central to robotics and autonomous systems. |
| Physics |
The application of machine learning in physics research, including particle physics, condensed matter, and physics-informed neural networks. |
| Plant phenotyping |
Using machine learning to measure, analyze, and predict plant traits from imaging data, supporting agricultural research and crop improvement. |
| Prompt engineering |
Techniques for designing and optimizing input prompts to elicit desired behaviors from large language models and other generative AI systems. |
| Protein engineering |
Using machine learning to design, modify, and optimize protein sequences and structures for desired properties, accelerating drug design and biotechnology. |
| Protein language models |
Language models trained on protein amino acid sequences to learn evolutionary and structural patterns, used for predicting protein function, structure, and fitness. |
| Python |
A high-level programming language that has become the de facto standard for machine learning and data science due to its readability and vast ecosystem of libraries (e.g., NumPy, Pandas, Scikit-learn, TensorFlow). |
| PyTorch |
An open-source machine learning library based on the Torch library, primarily used for applications such as computer vision and natural language processing. Developed by Meta AI (formerly Facebook AI Research).
| PyTorch-OOD |
A PyTorch library providing implementations of out-of-distribution detection methods, making it easy to benchmark and apply OOD detection techniques. |
| RAG (Retrieval-Augmented Generation) |
A technique that combines information retrieval systems with large language models, allowing the model to access and reference external knowledge to generate more accurate and grounded responses. |
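A toy end-to-end sketch: retrieve the best-matching document with bag-of-words cosine similarity (standing in for a real embedding-based retriever), then splice it into the model prompt. The documents and query below are made up:

```python
from collections import Counter
import math

docs = {
    "d1": "badgers are nocturnal mammals that dig burrows",
    "d2": "transformers use self attention over token embeddings",
    "d3": "madison is the capital city of wisconsin",
}

def vec(text):
    # bag-of-words term counts
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(query, k=1):
    q = vec(query)
    ranked = sorted(docs, key=lambda d: cosine(q, vec(docs[d])), reverse=True)
    return ranked[:k]

# Retrieval step, then prompt assembly for the generator LLM.
question = "what is the capital of wisconsin"
top = retrieve(question)
prompt = f"Context: {docs[top[0]]}\n\nQuestion: {question}"
```

The grounding happens in the last line: the generator sees the retrieved passage alongside the question, so its answer can cite external knowledge rather than relying on parameters alone.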
| Regression |
A supervised ML technique for predicting continuous numerical values based on input features. Common methods include linear regression, polynomial regression, and neural network-based regression. |
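A minimal example: ordinary least squares with an intercept via `numpy.linalg.lstsq`, on noise-free toy data so the fit recovers the generating coefficients exactly:

```python
import numpy as np

# Toy data generated from y = 2x + 1 (no noise).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix with an intercept column; solve the least-squares problem.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
```

With real, noisy data the recovered coefficients minimize squared error rather than matching a known ground truth.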
| Remote sensing |
The application of machine learning to analyze satellite, aerial, or drone imagery for tasks like land use classification, environmental monitoring, and disaster response. |
| Representation learning |
Machine learning methods that automatically discover and learn useful representations or features from raw data, reducing the need for manual feature engineering. |
| Reproducibility |
Ensuring that ML experiments and results can be consistently replicated by others, a key principle in scientific research. Often involves detailed documentation, version control, and the use of containers. |
| Retrieval |
Information retrieval techniques that search and rank relevant documents or data, often used in conjunction with LLMs for retrieval-augmented generation and semantic search. |
| RT-DETR (Real-Time Detection Transformer) |
An object detection model based on the DETR architecture, optimized for real-time processing. Suitable for applications requiring fast detection like video analysis and autonomous navigation. |
| SageMaker |
Amazon SageMaker, a fully managed AWS service for building, training, and deploying machine learning models at scale with built-in tools for every step of the ML workflow. |
| SAM (Segment Anything Model) |
Meta’s foundation model for image segmentation that can segment any object in an image using prompts like points, boxes, or text, enabling zero-shot generalization across segmentation tasks. |
| Science communication |
The practice of communicating scientific and ML research findings to broader audiences, including visualization, writing, and presentation techniques. |
| Scientific Text Mining |
Using machine learning to extract structured information, relationships, and knowledge from scientific literature and research publications. |
| Semi-supervised |
A machine learning paradigm that uses both a small amount of labeled data and a larger amount of unlabeled data for training, bridging supervised and unsupervised approaches. |
| Signal processing |
Techniques for analyzing, modifying, and synthesizing signals (e.g., audio, sensor data), often combined with ML methods for tasks like noise reduction, feature extraction, and pattern recognition. |
| SILO |
A UW-Madison seminar series (Systems, Information, Learning and Optimization) that brings together researchers across disciplines, including those applying machine learning in various scientific domains.
| Simulations |
Using computational simulations to generate synthetic data or validate ML models, common in fields like physics, biology, and engineering where real data may be scarce. |
| Sklearn (Scikit-learn) |
A machine learning library for Python that provides simple and efficient tools for data mining and data analysis, built on NumPy, SciPy, and Matplotlib. |
| Spectral analysis |
The analysis of frequency-domain representations of signals or data, used in audio processing, remote sensing, and other domains alongside ML methods. |
| Statistical learning |
Machine learning methods grounded in statistical theory, emphasizing inference, uncertainty quantification, and the mathematical foundations of learning from data. |
| Summarization |
Using machine learning, particularly NLP models, to automatically generate concise summaries of longer texts while preserving key information. |
| Sustainability |
The application of machine learning to environmental sustainability challenges, including climate modeling, resource optimization, and green computing practices. |
| SVM (Support Vector Machine) |
A classical ML algorithm that finds the optimal hyperplane for separating classes in feature space. Effective for both classification and regression, especially in high-dimensional spaces. |
| Tabular |
Machine learning methods and resources focused on structured tabular data (rows and columns), including techniques like gradient boosting, random forests, and neural approaches for tabular prediction. |
| Text analysis |
The process of deriving information from text data, often involving techniques from NLP. Used in sentiment analysis, topic modeling, and more. Common libraries include NLTK, SpaCy, and Gensim. |
| Text data |
Data in text format used for NLP and other ML tasks, including documents, web pages, social media posts, and scientific articles. See also: NLP. |
| Text extraction |
Techniques for extracting text content from documents, images, or other media, combining OCR, layout analysis, and NLP methods. |
| Text mining |
The process of discovering patterns, trends, and useful information from large collections of text data using ML and NLP techniques. |
| Time-series |
Machine learning methods for analyzing and predicting sequential, time-dependent data. Applications include forecasting, anomaly detection, and signal processing. |
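Two of the simplest baselines, sketched in NumPy on made-up values: a trailing moving average for smoothing, and a last-window mean as a naive one-step forecast:

```python
import numpy as np

series = np.array([3.0, 4.0, 5.0, 4.0, 6.0, 7.0, 6.0, 8.0])
window = 3

# Trailing moving average: each output is the mean of `window` consecutive points.
smoothed = np.convolve(series, np.ones(window) / window, mode="valid")

# Naive one-step forecast: the mean of the most recent window.
forecast = series[-window:].mean()
```

Simple baselines like this are worth computing before reaching for LSTMs or Transformers; a learned model should beat them convincingly to justify its complexity.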
| Topic modeling |
An unsupervised NLP technique for discovering abstract topics in text collections. Common algorithms include Latent Dirichlet Allocation (LDA) and neural topic models. |
| Transfer learning |
A machine learning technique where knowledge gained from training on one task is applied to improve performance on a different but related task, reducing the need for large labeled datasets. |
| Transformer |
A neural network architecture based on self-attention mechanisms, foundational to modern NLP and increasingly used in computer vision. Originally introduced in the “Attention Is All You Need” paper. |
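The core operation is scaled dot-product attention, which fits in a few lines of NumPy (random toy tensors here; real models add multiple heads, masking, and learned projections):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query tokens, dimension 8
K = rng.normal(size=(6, 8))   # 6 key tokens
V = rng.normal(size=(6, 8))   # 6 value vectors

out, weights = attention(Q, K, V)
```

Each output row is a weighted mixture of the value vectors, with weights given by how strongly the query matches each key; each row of `weights` sums to 1.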
| Trustworthy AI |
Approaches and practices that ensure AI systems are reliable, fair, transparent, and robust, especially in critical applications like healthcare or finance. Encompasses fairness, explainability, and safety. |
| Udacity |
An online learning platform offering courses, including nanodegree programs, on a variety of topics including machine learning and data science. Often includes projects and code-alongs. |
| Ultrasound |
The application of machine learning to ultrasound imaging data for tasks such as automated diagnosis, image segmentation, and quality enhancement in medical settings. |
| U-Net |
A convolutional neural network architecture originally designed for biomedical image segmentation, featuring a symmetric encoder-decoder structure with skip connections for precise localization. |
| Unsupervised learning |
Machine learning approaches that learn patterns and structure from unlabeled data, including clustering, dimensionality reduction, and generative modeling. |
| UW-Madison (University of Wisconsin-Madison) |
Resources, research, and initiatives from the University of Wisconsin-Madison related to machine learning and AI. |
| Videos |
Video-format resources including recorded lectures, tutorials, conference talks, and workshop recordings for learning machine learning topics. |
| Visualization |
Techniques for visually representing data, model architectures, training progress, and ML model outputs to aid understanding and communication. |
| ViT (Vision Transformer) |
Based on the transformer architecture, adapted for vision tasks. Images are split into patches (usually 16x16), which are flattened and treated as input tokens, then processed similarly to how words are processed in NLP transformers like BERT. |
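The patch-splitting step can be shown directly with NumPy reshapes (a dummy 32x32 RGB image and 16x16 patches; a real ViT then applies a learned linear projection to each flattened patch before adding position embeddings):

```python
import numpy as np

# A dummy 32x32 RGB image, split into 16x16 patches as in ViT.
img = np.arange(32 * 32 * 3).reshape(32, 32, 3)
p = 16

# Carve the image into a grid of patches, then flatten each patch into a token.
patches = (img.reshape(32 // p, p, 32 // p, p, 3)
              .transpose(0, 2, 1, 3, 4)      # group the patch grid together
              .reshape(-1, p * p * 3))       # one row per flattened patch
```

A 32x32 image yields a 2x2 grid of patches, i.e. 4 tokens of dimension 16*16*3 = 768, which the transformer then processes like a short sequence of words.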
| VLM (Vision-Language Model) |
A type of machine learning model that understands and generates content based on both visual and textual inputs, often used in tasks like image captioning and visual question answering. Examples include CLIP and LLaVA. |
| Webex |
Cisco Webex, a video conferencing platform. Used here to tag recorded presentations and meetings available as learning resources. |
| Wildlife |
The application of machine learning in wildlife research and conservation, including species identification from camera traps, population monitoring, and habitat analysis. |
| Workshops |
Workshop-format educational resources providing structured, hands-on learning experiences on specific machine learning topics and tools. |
| XGBoost (Extreme Gradient Boosting) |
An optimized gradient boosting library for classification and regression that is highly efficient, flexible, and widely used in ML competitions and industry applications. |
| YOLO (You Only Look Once) |
A family of real-time object detection models that predict bounding boxes and class labels in a single forward pass, known for their speed and efficiency. |
| Zero-shot learning |
A machine learning paradigm where models can classify or generate outputs for classes or tasks not seen during training, often leveraging semantic knowledge or natural language descriptions. |
| Zoom |
A video conferencing platform. Used here to tag recorded presentations and meetings available as learning resources. |