AI Glossary

AI Model Card
A documentation template describing the details, limitations, and intended use of an AI model.
Active Learning
A process where the model selects the most informative data points to be labeled by a human.
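One common way to pick the "most informative" points is uncertainty sampling: ask a human to label the examples the model is least sure about. The sketch below assumes a binary classifier that outputs a positive-class probability for each unlabeled example; the function name and the probabilities are hypothetical.

```python
def select_most_uncertain(probabilities, k=1):
    """Return indices of the k examples whose predicted probability
    is closest to 0.5, i.e. where the model is least certain."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:k]

# Hypothetical model outputs for five unlabeled examples.
probs = [0.95, 0.52, 0.10, 0.45, 0.80]
to_label = select_most_uncertain(probs, k=2)
print(to_label)  # [1, 3]
```

The selected examples would then be sent to a human annotator and added to the training set before the model is retrained.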
Aleph Alpha (Luminous)
A German-developed LLM known for transparency, explainability, and European language capabilities.
Algorithm
A step-by-step procedure for solving a problem or performing a computation.
Alpaca
An instruction-tuned LLaMA model developed by Stanford for educational and research use.
Anomaly Detection
Identifying unusual data points that differ from the norm.
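A minimal sketch of one simple approach, assuming roughly bell-shaped numeric data: flag any value whose z-score (distance from the mean in standard deviations) exceeds a threshold. The threshold of 2.0 and the sample readings are illustrative assumptions, not recommended defaults.

```python
import statistics

def zscore_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard
    deviations away from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical sensor readings with one obvious outlier.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(zscore_anomalies(readings))  # [50]
```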
Anthropic Claude
A family of AI assistants from Anthropic focused on safety and long-context understanding.
Artificial General Intelligence (AGI)
A type of AI with the ability to understand, learn, and apply knowledge across a wide range of tasks.
Artificial Intelligence (AI)
The simulation of human intelligence processes by machines, especially computer systems.
Artificial Narrow Intelligence (ANI)
AI that is specialized in one specific task.
Artificial Superintelligence (ASI)
A hypothetical AI that surpasses human intelligence across all fields.
Attention Mechanism
Technique allowing models to focus on relevant parts of input data, essential in transformer models.
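As an illustration, the scaled dot-product attention used in transformers can be sketched in plain Python for a single query vector. The helper names and the toy vectors are assumptions made for this example; real implementations operate on batched matrices.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Weight each value by how well its key matches the query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values))
              for j in range(dim)]
    return weights, output

# Toy example: the query matches the first key more closely.
weights, output = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
print(weights, output)
```

The first weight comes out larger than the second, so the output leans toward the first value vector.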
AutoML (Automated Machine Learning)
Tools that automate the end-to-end process of applying ML to real-world problems.
Avatar
A digital representation or character used to represent a user or persona, often used in AI-generated video or virtual platforms.
BERT (Bidirectional Encoder Representations from Transformers)
A transformer-based model that understands context in both directions.
Backpropagation
A method for training neural networks that propagates the output error backward through the layers to compute the gradients used to update the weights.
Baidu ERNIE Bot
A Chinese LLM by Baidu, trained with large-scale data from the Chinese internet.
Bard
An AI chatbot developed by Google, based on its language models and later rebranded as Gemini.
Bias
Systematic error introduced by an assumption in the machine learning process.
Bias Mitigation
Techniques used to reduce bias in AI models.
Black Box
A model whose internal workings are not visible or understandable.
Canva
A graphic design platform that integrates AI tools for content creation, design suggestions, and image generation.
Chain of Thought
A prompting technique where models are guided to reason through intermediate steps to reach an answer.
ChatGPT
An AI chatbot developed by OpenAI based on the GPT series of language models.
Chatbot
A software application used to conduct an online chat conversation via text or text-to-speech.
Classification
Assigning inputs into predefined categories.
Clustering
Grouping similar data points together in unsupervised learning.
Codex
An OpenAI model trained on code that powered the original GitHub Copilot.
Cohere Command R+
An LLM optimized for retrieval-augmented generation (RAG) tasks in enterprise applications.
Computer Vision
A field of AI that trains computers to interpret and understand the visual world.
Concept Drift
When the statistical properties of target variables change over time, affecting model performance.
DALL·E
An AI model from OpenAI that generates images from natural language descriptions.
Data Augmentation
Techniques used to increase the amount and diversity of training data, typically by creating modified copies of existing examples.
Data Preprocessing
Steps taken to clean and prepare data before training a model.
Databricks Dolly
An open instruction-following model from Databricks. The first version was based on GPT-J; Dolly 2.0, based on Pythia, was licensed for commercial use.
Deep Learning
A type of machine learning using neural networks with many layers.
Diffusion Model
A generative model that learns to create data (like images) by reversing a noise process.
Dimensionality Reduction
Techniques to reduce the number of input variables in a dataset.
Embedding
A representation of text or data in a dense vector space.
Embeddings Store
A searchable database of vector embeddings, used in semantic search and retrieval-augmented generation.
Epoch
One complete pass through the entire training dataset.
Ethical AI
AI developed and deployed in a way that respects human rights, fairness, and accountability.
Explainable AI (XAI)
AI systems designed to explain their decisions to humans.
Facebook LLaMA 2
Meta’s open-weight LLM family known for competitive performance and broad adoption.
Feature Engineering
The process of selecting and transforming variables to improve model performance.
Federated Learning
Training machine learning models across decentralized devices or servers without moving the raw data to a central location.
Few-shot Learning
Learning from a small number of examples.
Fine-tuning
Further training a pre-trained model on a specific task or dataset.
GPT (Generative Pre-trained Transformer)
A family of generative language models developed by OpenAI.
Gemini
Google’s next-generation AI model and platform that integrates text, image, and code understanding.
Generative AI
AI systems that can create new content, such as text, images, or music.
Gradient Descent
An optimization algorithm used to minimize the loss function in training.
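A minimal sketch of the idea on a one-variable function, assuming f(x) = (x - 3)^2 stands in for a loss function. The learning rate and step count are illustrative choices.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 4))  # 3.0
```

In model training, x is replaced by the full set of weights and the gradient is usually computed by backpropagation.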
Grok
An AI chatbot developed by xAI, a company founded by Elon Musk, integrated into the X platform (formerly Twitter).
Guardrails (in AI)
Mechanisms to enforce ethical, safety, or behavioral constraints in AI models.
Hallucination (in AI)
When an AI generates output that is plausible but factually incorrect or nonsensical.
HeyGen
An AI video generation platform that creates realistic avatars for personalized and business communication.
Hugging Face Open LLMs
A collection of community-built and fine-tuned models hosted on Hugging Face.
Hyperparameter
Settings used to control the training process of a model.
Inference
The process of using a trained model to make predictions.
Knowledge Graph
A structured representation of facts, entities, and relationships used for reasoning and inference.
LLaMA 3
Meta’s open-weight LLM family, first released in April 2024 as the successor to LLaMA 2.
Large Language Model (LLM)
A type of generative AI trained on vast amounts of text to understand and generate human-like text.
Latency (in inference)
The time delay between inputting data and receiving a model’s output.
LoRA (Low-Rank Adaptation)
A technique for efficiently fine-tuning large language models using fewer resources.
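The saving comes from freezing the original weight matrix and training only two small low-rank matrices whose product is added to it. The arithmetic below is a sketch; the layer size (4096 x 4096) and rank (8) are assumed values chosen for illustration.

```python
def full_params(d_out, d_in):
    """Trainable parameters when updating the whole weight matrix."""
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    """Trainable parameters for the two low-rank factors:
    B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)

d_out, d_in, rank = 4096, 4096, 8  # assumed layer shape and rank
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(full, lora)  # 16777216 65536
```

For this assumed layer, the low-rank factors hold 256 times fewer trainable parameters than the full matrix.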
Loss Function
A function that measures the error between predicted and actual outcomes.
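Mean squared error is one widely used example for regression. The sketch below uses made-up predictions and targets.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and targets."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical predictions versus true values.
loss = mean_squared_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
print(round(loss, 4))  # 0.1667
```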
MoE (Mixture of Experts)
A neural network architecture that selectively activates parts of the model for efficiency and performance.
Machine Learning (ML)
A subset of AI that involves the use of algorithms and statistical models to enable machines to improve at tasks with experience.
Microsoft Phi
A small yet powerful model from Microsoft, optimized for efficiency and reasoning.
Midjourney
An AI-powered image generation tool that creates artwork from text prompts.
Mistral 7B / Mixtral 8x7B
High-performing open-weight models from Mistral AI. Mistral 7B is a dense model, while Mixtral 8x7B uses a sparse mixture-of-experts architecture.
Model
A mathematical representation of a real-world process, trained to make predictions or decisions.
Model Distillation
The process of creating a smaller model that mimics a larger, more complex one.
Multimodal AI
AI systems that understand and process multiple data types (e.g., text + image).
Natural Language Processing (NLP)
A field of AI that gives machines the ability to read, understand, and respond in human languages.
Neural Network
A network of interconnected artificial neurons, loosely inspired by the structure of the human brain, that processes data.
OpenAI GPT-3.5
A predecessor to GPT-4, widely used for general-purpose tasks.
OpenAI GPT-4 / GPT-4 Turbo
Advanced models known for reasoning, code generation, and multimodal input.
OpenAI Whisper
A model for automatic speech recognition, not an LLM but often used alongside them.
Overfitting
A modeling error that occurs when a model is too complex and fits noise in the training data, so it generalizes poorly to new data.
PaLM 2
A large language model developed by Google that powered early versions of Bard.
Prompt Engineering
Crafting input prompts to effectively interact with language models.
Prompt Injection
A vulnerability where malicious prompts are embedded in input data to manipulate AI behavior.
RLHF (Reinforcement Learning from Human Feedback)
A training method that aligns model outputs with human preferences.
Regression
Predicting continuous values from input data.
Reinforcement Learning
A type of machine learning where agents learn to make decisions by receiving rewards or penalties.
Retrieval-Augmented Generation (RAG)
Combines language models with a retrieval system for better-informed responses.
Self-supervised Learning
A method where the model generates its own labels from raw data during training.
Shot Learning
Refers to zero-shot, one-shot, or few-shot learning, describing how many examples a model needs to perform a task.
StableLM
An open-source LLM by Stability AI, designed for creativity and code.
Supervised Learning
A type of machine learning where models are trained on labeled data.
Synthetic Data
Artificially generated data used to train AI models.
Token Limit
The maximum number of input and output tokens a model can process at once.
Tokenization
The process of splitting text into smaller pieces (tokens), often words or subwords.
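A minimal sketch of word-level tokenization with a regular expression. This is an illustrative assumption, not how production LLM tokenizers work: those typically use learned subword vocabularies (for example byte-pair encoding).

```python
import re

def simple_tokenize(text):
    """Lowercase the text, then split it into runs of word
    characters and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

tokens = simple_tokenize("Tokenizers split text, don't they?")
print(tokens)
# ['tokenizers', 'split', 'text', ',', 'don', "'", 't', 'they', '?']
```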
Training Data
The dataset used to train an AI model.
Transfer Learning
Reusing a model pre-trained on one problem as the starting point for a new but related problem.
Transformer
A deep learning architecture that uses self-attention; forms the backbone of models like GPT and BERT.
Tsinghua GLM
A multilingual LLM developed by researchers at Tsinghua University.
Turing Test
A test, proposed by Alan Turing, of whether a machine's conversational behavior is indistinguishable from a human's.
Underfitting
A scenario where a model is too simple to capture the underlying pattern of the data.
Unsupervised Learning
Machine learning using data that has not been labeled or categorized.
Variance
The model’s sensitivity to small changes in the training dataset.
Vector Database
A specialized database optimized for storing and searching high-dimensional vectors.
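The core operation is nearest-neighbor search over vectors. The sketch below does it by brute force with cosine similarity over an in-memory dictionary; the three-dimensional vectors and the `store` contents are made up for illustration, and real vector databases generally rely on approximate indexes to scale.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, store, top_k=1):
    """Return the names of the top_k stored vectors most similar to query."""
    ranked = sorted(store.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Hypothetical embeddings; real ones have hundreds of dimensions.
store = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
print(nearest([1.0, 0.0, 0.0], store, top_k=2))  # ['cat', 'dog']
```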
Vicuna
A fine-tuned LLaMA model that performs well in dialogue and chat scenarios.
White Box
A model whose decisions and internal logic are transparent.
WizardLM
A family of instruction-tuned models built on top of LLaMA with improved reasoning.
Zero-shot Learning
Making predictions without the model having seen any examples of the task.
xAI Grok
Elon Musk’s LLM integrated with real-time data from X (formerly Twitter).