AI Terminology 100

🧠

A2A Protocol

Agent-to-Agent communication protocol that lets AI agents exchange tasks and information with each other. QA may test that these automated interactions are secure and behave as expected.

📦

Acceptance Thresholds (Metrics)

Minimum performance levels for release. The product team defines these thresholds, and QA assesses whether the model meets them and is ready for deployment.
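
A minimal sketch of a release gate built on such thresholds. The metric names and minimum values here are hypothetical and chosen only for illustration; a real product team would supply its own.

```python
# Hypothetical thresholds; real values come from the product team.
THRESHOLDS = {"accuracy": 0.90, "f1": 0.85}

def failed_metrics(measured, thresholds):
    """Return the names of metrics that are missing or below their minimum."""
    return sorted(
        name for name, minimum in thresholds.items()
        if measured.get(name, float("-inf")) < minimum
    )

def ready_for_release(measured, thresholds):
    """True only when every threshold is met."""
    return not failed_metrics(measured, thresholds)

measured = {"accuracy": 0.93, "f1": 0.81}
print(failed_metrics(measured, THRESHOLDS))     # metrics blocking the release
print(ready_for_release(measured, THRESHOLDS))  # overall go / no-go
```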

🪄

Accuracy (Metric)

Percentage of correct predictions. QA uses it for assessing overall model correctness on test sets.
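
As a rough illustration, accuracy can be computed by comparing predictions with expected labels. The labels below are made up for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the expected labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

expected = ["cat", "dog", "cat", "dog"]
predicted = ["cat", "dog", "dog", "dog"]
print(accuracy(expected, predicted))  # 3 of 4 predictions are correct
```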

⚙️

Adaptive Testing

Testing that changes as the AI evolves. QA continuously adjusts test scenarios to validate the model's evolving behavior.

🎯

Adversarial

Intentional input manipulation to mislead models. QA tests model robustness against such attacks.

🖼️

AGI

Artificial General Intelligence: a theoretical AI that could understand and perform any intellectual task a human can. The concept helps frame discussions of how AI is evolving.

⚙️

AI Agents

Autonomous components that can make decisions and perform tasks without human intervention. QA may need to test for autonomy, reliability, and safety.

📦

AI Alignment

Ensuring AI behaviors stay aligned with human goals and ethical expectations. QA often checks for safety, fairness, and unintended behaviors.

🚀

AI Model

A trained system (e.g., neural network) that maps inputs to outputs. Testers often validate it against expected performance, accuracy, and edge cases.

🌐

AI Wrapper

A software layer or application that simplifies access to AI models. QA tests it like any other software application, with extra attention to its AI-driven functionality.

📝

Algorithm

The underlying set of rules or logic that the AI follows. Testers may verify logical consistency and traceability.

🔍

ANN (Artificial Neural Network)

A machine learning model inspired by the brain. QA may test the outputs for accuracy, edge case handling, overfitting/underfitting issues, etc.

📝

Artificial Intelligence

Computer systems that mimic human intelligence to solve problems. QA testers evaluate how reliably and safely they perform tasks.

📊

Assistant Role (In Prompt Engineering)

The AI’s response behavior. QA tests for correctness, tone, and task-following accuracy.

💡

Autonomous

Refers to systems that operate independently. QA tests validate that such systems behave predictably and safely.

📝

Bias

When a model shows unfair preferences. QA tests for performance across diverse groups to detect bias.

🎯

Bidirectional Testing

Evaluates how input affects output and how outputs may influence subsequent inputs or prompts. QA tests both directions to detect inconsistencies or circular dependencies.

🌐

Chain-of-Thought (CoT) Prompting

Encourages the AI to reason step-by-step. QA checks each reasoning step for correctness and coherence.

📦

Chatbot

An AI system that interacts with users in natural language. Testers evaluate language understanding, edge cases, and intent coverage.

📊

CNN (Convolutional Neural Network)

A neural network best for image processing. QA tests image classification accuracy and robustness to distortions.

🤖

Computer Vision

AI interpreting visual input. QA tests include edge detection, object recognition accuracy, and robustness to noise.

📚

Concept Drift

This happens when what a model is predicting starts to mean something different over time—like when “fraud” in financial data changes due to new scam tactics. QA testers check if the model’s predictions still match current realities and whether labels need to be updated.

🌐

Context

Relevant data and history an AI uses for decision-making. Testers check that it is properly maintained and used.

🛠️

Context Injection Testing

Simulates scenarios where contextual inputs could manipulate or interfere with system prompts. QA verifies prompt isolation and defense layers.

🚀

Context Window

The maximum amount of text, measured in tokens, that an LLM can consider at once. QA checks truncation effects and response degradation.

🚀

Continuous Learning

AI improves by learning from new data. QA tests that model updates do not introduce regressions.

📈

Co-piloting

AI-assisted development or testing support. QA validates suggestion accuracy, usefulness, and workflow integration.

🛠️

Customize ChatGPT

Modifying ChatGPT’s behavior using system instructions. It is a useful way to tailor ChatGPT for higher-quality outputs.

🚀

Data Dependency

AI relies heavily on training data. QA tests for diverse, representative, and high-quality datasets.

🌐

Data Drift

When incoming data shifts from training data. QA detects shifts that could lead to performance and accuracy drops.
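
One simple way to flag such a shift, sketched under assumptions: compare the mean of a live feature with the training mean, measured in training standard deviations. The sample values and the cutoff of 2.0 are hypothetical, and production monitoring usually relies on richer statistical tests.

```python
from statistics import mean, stdev

def mean_shift_in_std(train_values, live_values):
    """Distance between live and training means, in training standard deviations."""
    spread = stdev(train_values)
    if spread == 0:
        raise ValueError("training values have no spread")
    return abs(mean(live_values) - mean(train_values)) / spread

train = [10, 12, 11, 13, 12, 11, 10, 12]
live_ok = [11, 12, 10, 13]
live_shifted = [18, 19, 20, 21]

DRIFT_CUTOFF = 2.0  # hypothetical alert level
print(mean_shift_in_std(train, live_ok) > DRIFT_CUTOFF)       # no alert expected
print(mean_shift_in_std(train, live_shifted) > DRIFT_CUTOFF)  # alert expected
```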

🪄

Data Science

Field focused on extracting insights from data. QA may validate models built by data scientists and the correctness of data pipelines.

📚

Dataset

A collection of data used for training or testing models. QA checks data quality, labeling accuracy, and bias.

📈

Deep Learning

Subset of ML with layered neural networks. QA may test for generalization, overfitting, and inference correctness.

🤖

Embedding

Numeric vectors that represent words/phrases. QA validates clustering, similarity scores, and semantic consistency.
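
Similarity between embeddings is commonly measured with cosine similarity. The sketch below uses tiny made-up 3-dimensional vectors; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors, invented for illustration only.
king = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.15]
banana = [0.1, 0.05, 0.95]

# A semantic-consistency check: related words should score higher than unrelated ones.
print(cosine_similarity(king, queen) > cosine_similarity(king, banana))
```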

📈

Ethics or Ethical Considerations

Responsible AI usage. QA tests for bias, misuse, and harmful outcomes.

🛠️

Explainability

How well humans can understand AI decisions. QA checks that explanations are available, truthful, and useful.

🔒

F1-Score (Metric)

Harmonic mean of precision and recall, balancing the two. QA uses it to assess performance on imbalanced datasets.
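
A small worked sketch of precision, recall, and F1 for a binary classifier. The labels are invented; they show how accuracy can look acceptable on an imbalanced set while F1 stays low.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive label."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# 3 positives out of 10 samples; the model finds only one of them.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)  # 7 of 10 predictions are right, yet F1 is only about 0.4
```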

💡

Fairness

Equity in model outputs across users. QA tests that no group is unfairly disadvantaged.

🔒

Few-shot Prompting

AI is given a few examples before answering. QA validates learning from small examples and matching output format.
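
A minimal sketch of assembling a few-shot prompt. The `Input:` / `Output:` layout and the function name are assumptions for illustration, not a required format.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Join worked examples and the new query into one prompt string."""
    parts = [instruction] if instruction else []
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # Leave the final output empty so the model completes it.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

examples = [("I love this phone", "positive"), ("The battery died in an hour", "negative")]
prompt = build_few_shot_prompt(examples, "Great screen, fast delivery",
                               instruction="Classify the sentiment.")
print(prompt)
```

A QA check can then assert that the model's answer follows the same format as the example outputs.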

💡

Fine-tuning

Additional training on specific data. QA compares performance before and after fine-tuning.

🪄

Foundation Model

A large pre-trained model adaptable to many tasks. QA evaluates adaptability, efficiency, and risks of over-generalization.

🖼️

Frequency Penalty

Reduces repetition by lowering likelihood of repeated words. QA evaluates its effect on redundancy and coherence.
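
The sketch below shows one common formulation, modeled on the adjustment described in OpenAI's API documentation: each token's score is lowered in proportion to how often it has already appeared, with an optional flat presence penalty for any token seen at least once. Treat the exact behavior as an assumption for any particular provider; the scores and token names are invented.

```python
from collections import Counter

def apply_penalties(logits, generated_tokens, frequency_penalty=0.0, presence_penalty=0.0):
    """Return adjusted scores after penalizing tokens that were already generated."""
    counts = Counter(generated_tokens)
    return {
        token: logit
        - counts[token] * frequency_penalty                  # grows with each repeat
        - (presence_penalty if counts[token] > 0 else 0.0)   # flat, once seen
        for token, logit in logits.items()
    }

logits = {"the": 2.0, "cat": 1.0, "sat": 0.5}
adjusted = apply_penalties(logits, ["the", "the", "cat"],
                           frequency_penalty=0.5, presence_penalty=0.2)
print(adjusted)  # "the" is penalized most, "sat" is unchanged
```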

🔒

G.U.I.D.E. Prompting Formula

A structured way (Goals-Users-Instructions-Details-Examples) to design effective prompts. QA uses it to construct better prompts that elicit consistent and relevant AI outputs.

💡

Generative AI

AI that creates content (text, images, etc.). QA validates creativity vs. control, safety, and factuality.

🧠

GPU

Graphics hardware used to run AI efficiently. QA may monitor GPU utilization and performance under load.

🖼️

Ground Truth

The labeled data used as the correct answer during testing. Testers validate model outputs against ground truth.

🚀

Hallucination

When the model generates false or misleading content. Testers identify and report incorrect or nonsensical outputs.

🔒

Human-in-the-Loop (HITL) Testing

Humans validate or correct model decisions. QA includes HITL as part of accuracy assurance.

📈

Inference

The model making predictions from inputs. QA tests that predictions are accurate, fast, and explainable.

📊

Instruction Collision Testing

Tests what happens when multiple instructions conflict. QA checks if the model prioritizes or blends conflicting directions responsibly.

🪄

LLM (Large Language Model)

A massive text-trained model like GPT-4. QA tests outputs for correctness, reliability, and adherence to constraints.

📦

Machine Learning

AI that learns from data. QA tests ML systems across different datasets, distributions, and use cases.

🧠

Max Tokens

The maximum number of tokens (pieces of words, not whole words or characters) the AI can generate in a response. QA testers validate that responses respect this limit and don’t truncate unexpectedly.

🔍

MCP (Model Context Protocol)

A protocol framework for passing background information—like user history, settings, or task data—to AI models so they can respond more accurately. QA testers validate that this context is delivered correctly and that the model behaves appropriately based on it.

🔍

MLOps

Operational practices for ML lifecycle. QA may test deployment stability, versioning, and monitoring processes.

🔒

Model Architecture

The design or structure of the AI model. QA may test performance differences across architectures.

⚙️

Model Drift

When model behavior changes over time. QA monitors for accuracy degradation or unexpected predictions.

📚

Multimodal Testing

Validates models that use more than one input type (e.g., text + image). QA checks if the system processes and combines inputs correctly.

🧠

Natural Language Processing (NLP)

AI that processes and understands human language. Testers validate intent recognition, language accuracy, and contextual responses.

📚

Neural Network

A type of model inspired by the human brain. QA may test that learning is effective and, where possible, interpretable.

⚙️

Overfitting

When the model performs well on training data but poorly on new data. QA uses holdout or test sets to detect it.
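
A simple sketch of that check: compare the score on training data with the score on held-out data and flag a large gap. The scores and the 0.10 gap limit are hypothetical.

```python
def generalization_gap(train_score, test_score):
    """How much better the model scores on data it has already seen."""
    return train_score - test_score

def looks_overfit(train_score, test_score, max_gap=0.10):
    """Flag the model when the train/test gap exceeds the allowed limit."""
    return generalization_gap(train_score, test_score) > max_gap

print(looks_overfit(train_score=0.99, test_score=0.72))  # large gap
print(looks_overfit(train_score=0.91, test_score=0.89))  # small gap
```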

💡

Parameters

The internal weights the model learns. QA doesn’t directly test them, but they influence output quality.

🤖

Precision (Metric)

Correctness among positive predictions. QA uses it to check for false positives.

🔍

Presence Penalty

Discourages using words already mentioned. QA checks for novelty and logical flow in results.

📊

Pre-trained Transformer

A model already trained on large data and adapted for new tasks. QA tests adaptation quality and residual knowledge.

🔍

Pretraining

Initial large-scale training on generic data. QA may test for knowledge transfer during fine-tuning.

🔍

Privacy

Protection of user data. QA checks for data leaks, logging issues, and compliance with data handling policies.

🌐

Probabilistic Outputs

AI provides outputs with varying confidence. QA evaluates certainty thresholds and result variation.

🪄

Prompt Engineering

Crafting input prompts to guide model behavior. QA creates prompt suites to test consistency and compliance.

🧩

Prompt Entanglement Testing

Examines whether earlier prompts unintentionally affect later responses. QA ensures prompt sessions are scoped correctly.

🧠

RAG (Retrieval-Augmented Generation)

A model architecture that uses search results to enhance output. QA validates grounding, retrieval accuracy, and source attribution.

🪄

Reasoning Model

Models designed for logical tasks. QA tests chain of thought, deduction accuracy, and reasoning robustness.

🧩

Recall (Metric)

Ability to find all relevant positives. QA uses it to check for false negatives.

🧠

Reinforcement Learning

Learning through reward signals. QA verifies stability, convergence, and policy safety.

🤖

Repeatability Testing

Checks whether a model consistently gives the same output when run multiple times under the same conditions. QA uses it to assess determinism and debugging reliability.
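
A minimal repeatability check, assuming the model can be called as a plain function. `stable_model` and `flaky_model` are stand-ins invented for this example; a real test would call the system under test with fixed settings.

```python
from collections import Counter
from itertools import cycle

def repeatability_rate(generate, prompt, runs=5):
    """Fraction of runs that returned the most common output (1.0 = fully repeatable)."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    outputs = [generate(prompt) for _ in range(runs)]
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / runs

def stable_model(prompt):
    """Stand-in for a deterministic model."""
    return prompt.upper()

_answers = cycle(["yes", "no"])

def flaky_model(prompt):
    """Stand-in for a model whose answer alternates between runs."""
    return next(_answers)

print(repeatability_rate(stable_model, "hello"))          # every run agrees
print(repeatability_rate(flaky_model, "hello", runs=4))   # only half the runs agree
```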

🪄

Response Format

The structure or schema of the model’s output (e.g., JSON, markdown). QA verifies that formatting meets application needs and is parseable.
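
For JSON output, a basic parseability check might look like the sketch below. The required keys are hypothetical; an application would use its own schema.

```python
import json

def validate_json_response(raw, required_keys):
    """Return (ok, message) for a model response expected to be a JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"not valid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return False, "top-level value is not an object"
    missing = sorted(set(required_keys) - set(data))
    if missing:
        return False, "missing keys: " + ", ".join(missing)
    return True, "ok"

print(validate_json_response('{"answer": "42", "confidence": 0.9}', ["answer", "confidence"]))
print(validate_json_response('{"answer": "42"}', ["answer", "confidence"]))
print(validate_json_response("Sure! Here is the JSON: {...}", ["answer"]))
```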

🛠️

RNN (Recurrent Neural Network)

A network for sequential data (e.g., text, time series). QA checks for memory of past inputs and proper sequence predictions.

🔒

Robustness

Stability of model outputs under noisy, diverse, or adversarial inputs. QA stress-tests edge conditions and rare cases.

⚙️

Semi-supervised Learning (Algorithm)

A method using both labeled and unlabeled data. QA may test for generalization and monitor performance gaps.

🖼️

Sensitivity Analysis

Analyzes how small changes in input affect output. QA uses it to identify fragile or overly sensitive behaviors.
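
One way to sketch this is a finite-difference probe: nudge one input feature and measure how much the output moves. `toy_model` is an invented stand-in for a numeric model.

```python
def sensitivity(model, inputs, index, delta=0.01):
    """Approximate change in output per unit change of one input feature."""
    perturbed = list(inputs)
    perturbed[index] += delta
    return (model(perturbed) - model(inputs)) / delta

def toy_model(x):
    """Stand-in model: depends strongly on x[0] and weakly on x[1]."""
    return 3.0 * x[0] + 0.1 * x[1]

baseline = [1.0, 2.0]
print(sensitivity(toy_model, baseline, index=0))  # close to 3.0, a sensitive feature
print(sensitivity(toy_model, baseline, index=1))  # close to 0.1, a stable feature
```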

📚

Stop Sequences

Text that tells the AI where to stop generating. QA verifies proper cutoffs to avoid extra or incomplete responses.
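
The expected cutoff behavior can be sketched as a plain string operation: the output ends just before the earliest stop sequence. The function name and sample text are illustrative assumptions, useful as a reference when checking a real model's output.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut text just before the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty stop strings
        position = text.find(stop)
        if position != -1:
            cut = min(cut, position)
    return text[:cut]

raw = "Answer: 42\nUser: next question"
print(repr(truncate_at_stop(raw, ["\nUser:"])))
```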

🤖

Supervised Learning

Training with labeled data. QA uses test sets and metrics (accuracy, F1) to validate.

🚀

System Role (In Prompt Engineering)

Defines the AI’s behavior context (e.g., expert, assistant). QA tests for role consistency and output alignment.

🤖

Temperature

Affects randomness—higher values produce more creative, varied responses. QA tunes this for output consistency vs creativity.
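
The effect can be sketched with a temperature-scaled softmax: scores are divided by the temperature before being turned into probabilities. The scores below are invented for illustration.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert scores to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.5]
print(softmax_with_temperature(scores, temperature=0.2))  # top token dominates
print(softmax_with_temperature(scores, temperature=2.0))  # probabilities are closer together
```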

🎯

Text-to-Image Generation

Creating images from text prompts. QA assesses image quality, prompt alignment, and potential harm.

🔍

Tokenization

Splitting text into tokens for processing. QA tests that token limits are respected and text structure isn’t broken.

🔍

Top P

Controls randomness by limiting tokens to a cumulative probability. QA tests how this affects output diversity and determinism.
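
A rough sketch of the filtering step (often called nucleus sampling): keep the most likely tokens until their cumulative probability reaches `top_p`, then renormalize. The token probabilities are made up.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalize so the kept probabilities sum to 1.
    return {token: p / total for token, p in kept}

probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(top_p_filter(probs, top_p=0.75))  # only "a" and "b" remain candidates
print(top_p_filter(probs, top_p=0.4))   # only "a" remains, so output is deterministic
```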

🌐

TPU

Tensor Processing Unit – Google’s AI chip.

📦

Training

The full process of model learning. QA checks for learning progress, convergence, and generalization.

🤖

Training Data

The dataset used to teach the model. QA assesses quality, diversity, and labeling accuracy.

💡

Transformer

Model design used in most LLMs. QA focuses on attention behaviors, context limits, and generation stability.

⚙️

Transformer Model

A model using attention mechanisms. QA tests for handling of long sequences and correct interpretation.

🔍

Tree-of-Thought (ToT) Prompting

AI explores multiple reasoning paths like decision trees. QA tests for completeness and evaluates path quality.

📚

Unsupervised Learning

Learning without labeled data. QA checks the discovered patterns and their usefulness in downstream tasks.

🌐

User Role (In Prompt Engineering)

The prompt or question from the human user. QA designs diverse prompts to validate AI understanding.

🔒

Variance/Configuration Testing

Tests how changes in settings (e.g., temperature, top_p) affect model behavior. QA evaluates performance and stability across configurations.

🔍

Vibe Coding

Natural language-driven coding tools. QA tests IDE integration, code accuracy, and developer prompts.

📈

Weights

Numerical values the model adjusts during training. QA indirectly assesses them via output quality.

💡

Zero-Shot Learning

Making predictions without task-specific training. QA assesses generalization and failure cases.

📚

Zero-shot Prompting

AI performs a task without specific examples. QA evaluates if it handles generalization effectively.

100 Common AI terminologies for QA
