This glossary provides a unified reference for the core terminology used across the Vionlabs platform, APIs, and documentation.
It defines how we describe entities, processes, and concepts within the product, ensuring consistent understanding across engineering, product, and customer teams.
The terms are organized by domain (Business Concepts, Product Features, Technical Terms, Platform Components, and 3rd-Party Systems) and cover both operational definitions and technical abbreviations.
Use this glossary to:
- Clarify how platform entities like Catalog, Asset, Product, and FeatureCode relate to one another.
- Understand the scope of processing terms used throughout the API and data model.
- Reference acronyms and domain-specific terms when integrating or building on top of the Vionlabs platform.
This page is continuously updated as the platform evolves to include new capabilities, features, and technical definitions.
Business Domain
| Term | Acronym | Definition |
| --- | --- | --- |
| Account | – | A logical subdivision within an organization. Used to structure access, e.g., for subsidiaries, brands, or sandbox use cases. |
| Asset | – | A piece of digital content (video, audio, image) managed within a catalog or content library. |
| Catalog | – | A customer's media library. Contains metadata such as name and description and is the root entity for all processing. Linked 1:1 to an |
| Operation | – | Metadata configuration for a catalog. Defines processing flags like |
| ApiKey, ApiKeyPermission | – | Stores customer-facing API keys and their permissions. Keys are sensitive and should preferably be encrypted before being handed over to the customer. Keys are linked to a catalog and grant access to result APIs. |
| CatalogApiConfig | – | Per-catalog configuration for result APIs. Includes result versions, flags ( |
| CatalogFeatureConfig | – | Manages active feature codes per catalog. Includes |
| FeatureCode | FC | Internal identifier for a unit of work in the pipeline. Defines Docker image, feature type, dependencies, and scheduling metadata. |
| Inventory | – | Media items (movies, episodes, series) within a catalog. Each entry links to related processing jobs and status. |
| AssetFeatureStatus | – | Tracks feature processing status per asset. Serves as the backend unit behind the Job entity. |
| Job | – | Logical grouping of |
| AvailableTask | – | Defines which product/feature versions are available for task creation via the Task API. Includes priority and activation status. |
| Product | – | A view combining |
| UserCatalog | – | Many-to-many mapping between users and catalogs. Includes a default-catalog flag for UI display. |
| Operation History | – | Logs admin changes and configuration updates made in the Django Admin portal. |
| Reports | – | Analytics and billing data. Reports include ingestion volume, feature usage, processing time, and order-based billing events. |
| Customer / Organization | – | The end-user or company consuming Vionlabs' technology for AI-driven content operations. |
| Episode | – | A single unit in a serialized show. |
| Genre | – | A content category (e.g., Drama, Comedy) used for classification and recommendation. |
| IMDb / TMDB | – | Widely used databases offering movie metadata, often used for enrichment or cross-reference. |
| Character Tracking | – | AI-based method for identifying and following characters through a video. |
| Intro / Recap / Credit Detection | – | AI-powered tagging of intros, recaps, and credits to enable skipping or repurposing scenes. |
| Keywords | – | Descriptive tags derived from video/audio/text to improve discoverability and search relevance. |
| Movie | – | A full-length video asset analyzed with scene-level insights. |
| Mood | – | The emotional tone of a scene (e.g., sad, tense, joyful), generated via multimodal analysis. |
| Mood Tags | – | Metadata labels that describe the emotional content of a scene or title. |
| Series | – | A multi-episode content format. Each installment is processed individually or as part of a set. |
| Subtitles | – | Text version of spoken dialogue, usually in the original or a translated language, displayed on-screen. |
| Synopsis | – | A short AI-generated summary of a video’s story, used for previews and catalogs. |
| Thumbnail | – | A representative image, typically AI-selected, used as the visual cue for content in a catalog or interface. |
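As a rough illustration of how some of the entities above relate, the sketch below models FeatureCode, a catalog with its active feature codes, and AssetFeatureStatus as plain Python dataclasses. All field names, the status values, and the `status_by_asset` helper are hypothetical; the glossary does not specify the actual schema, so treat this as a reading aid and not as the platform's data model.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FeatureCode:
    """A unit of work in the pipeline (field names are assumptions)."""
    code: str
    docker_image: str
    feature_type: str
    dependencies: tuple = ()


@dataclass
class Catalog:
    """A customer's media library; root entity for processing."""
    name: str
    description: str = ""
    # Stands in for CatalogFeatureConfig: the feature codes active for this catalog.
    active_feature_codes: set = field(default_factory=set)


@dataclass(frozen=True)
class AssetFeatureStatus:
    """Processing status of one feature code for one asset."""
    asset_id: str
    feature_code: str
    status: str  # hypothetical values: "pending", "running", "done", "failed"


def status_by_asset(rows):
    """Group per-feature status rows into {asset_id: {feature_code: status}}."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row.asset_id][row.feature_code] = row.status
    return dict(grouped)


if __name__ == "__main__":
    rows = [
        AssetFeatureStatus("asset-1", "intro", "done"),
        AssetFeatureStatus("asset-1", "mood", "pending"),
        AssetFeatureStatus("asset-2", "intro", "failed"),
    ]
    print(status_by_asset(rows))
```

Grouping the status rows by asset is one plausible way a job-level view could be derived from AssetFeatureStatus records.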
Product Features
| Term | Acronym | Definition |
| --- | --- | --- |
| Adbreaks | – | Given time series of audio (VGGish) and video (Inception) features, predicts the seconds at which ad breaks should occur. |
| Adbreaks-Multi | – | Collects the different components for scene-boundary detection and aligns them with speech to produce ad breaks. No inference is performed. |
| Binge Markers | BM | AI-detected timestamps such as intros, recaps, and credits that enable seamless binge-watching. |
| Contextual AdBreak | – | Ad placement at relevant moments in video content, based on mood and narrative context. |
| Fingerprint+ | FP+ | Enhanced fingerprinting using multimodal embeddings and context-aware metadata. |
| Jit Preview | JIT (JP) | Dynamically generated preview clips based on user query context. Generated only when requested. |
| Jit Thumbnail | JIT (JT) | Thumbnails rendered in real time based on the most relevant moment at request time. |
| Metadata | – | Descriptive video data (e.g., mood, actors, keywords) used for search, sort, and recommendation. |
| Montage / Trailer | – | A sequence of selected video clips edited into a promotional or summary format. |
| Preview Clip | – | Short, auto-generated segment of a video optimized to encourage viewing. |
| Scene Search | – | Search functionality that enables users to find scenes based on content characteristics. |
| Character | – | Character tracking combining clustering, facial recognition, and classical object tracking algorithms. |
| Intro | – | Detects start and end times of the intro in a video based on multimodal features. |
| Recap | – | Detects start and end times of the recap in a video. |
| Credit | – | Detects start and end times of the credit section in a video. |
| Keyword | – | Predicts keywords and mood tags based on multimodal embeddings. |
| Mood | – | Predicts mood metadata using synth embeddings from multiple modalities. |
| Synopsis (LLM) | – | Generates a movie synopsis from a transcript using an LLM. |
| NSFW / Nudity Detection | – | Classifies whether video frames contain nudity. NSFW is an alias. |
| Profanity | – | Predicts whether text contains offensive language. Based on an off-the-shelf model. |
| Language Detection | – | Detects the primary spoken language based on transcripts. |
| Speech Translation | – | Performs speech transcription and translation into English. |
| Anime Character Recognition | – | Runs feature extraction for detected anime faces. |
| Anime Character Detection XS | – | Face detection for anime/cartoon content. |
| Anime Character Portraits | – | Extracts facial portraits of recognized anime characters. |
| Backdropness | – | Predicts whether an image is suitable for use as a thumbnail. |
| Emb-Inc / Emb-VGG / Emb-Synth | – | Extracts and synthesizes visual/audio embeddings for fingerprinting. |
| Emb-Synth-NPY | – | Converts synth embeddings to NumPy arrays for downstream use. |
| Keyword-Joiner / Keyword-Mood / Keyword-NoMood | – | Filters or combines keyword and mood outputs. |
| Speech-Lite | – | Voice Activity Detection (VAD) for determining speech presence per second. |
| Shot-Sim | – | Scene boundary detection based on color histogram deltas. |
| Scenes | – | Extracts image statistics for determining shot boundaries. |
| Scenes-Boundary | – | Predicts scene boundaries in videos. |
| Seg-Delta & VAD Models | – | Segment-based models for emotion, keyword, and text embedding extraction. |
| Text Detection / Selection / Keywords | – | NLP models using LLaMA or Sent2Vec to identify relevant keywords or classify text frames. |
| Transcript Segmentation | – | Breaks long transcripts into semantically meaningful segments. |
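To make the idea of boundary detection from color histogram deltas (as described for Shot-Sim) more concrete, here is a minimal pure-Python sketch. It is not the Vionlabs implementation: the frame representation (a list of RGB tuples), the bin count, the distance measure, and the threshold are all assumptions chosen for illustration.

```python
def color_histogram(frame, bins=4):
    """Normalized joint RGB histogram of a frame given as a list of (r, g, b) pixels."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins  # width of each intensity bin per channel
    for r, g, b in frame:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = len(frame) or 1  # avoid division by zero for empty frames
    return [count / total for count in hist]


def histogram_deltas(frames, bins=4):
    """Distance between consecutive frame histograms, in the range [0, 1]."""
    hists = [color_histogram(frame, bins) for frame in frames]
    return [
        sum(abs(a - b) for a, b in zip(h1, h2)) / 2.0  # half the L1 distance
        for h1, h2 in zip(hists, hists[1:])
    ]


def detect_boundaries(frames, threshold=0.5, bins=4):
    """Indices of frames that start a new segment (delta above the threshold)."""
    deltas = histogram_deltas(frames, bins)
    return [i + 1 for i, delta in enumerate(deltas) if delta > threshold]


if __name__ == "__main__":
    red = [(255, 0, 0)] * 4
    blue = [(0, 0, 255)] * 4
    print(detect_boundaries([red, red, blue, blue]))
```

A fixed threshold is the simplest possible decision rule; a production system would more likely adapt it to the content or feed the deltas into a learned model.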
Technical Terms
| Term | Acronym | Definition |
| --- | --- | --- |
| Amazon Machine Image | AMI | A virtual machine image built per customer that includes packaged Feature Extractors (FEs). Used in edge or isolated environments. |
| AWS / GCP | – | Amazon Web Services / Google Cloud Platform. Cloud infrastructure for compute, storage, and AI pipelines. |
| Bitrate | – | The amount of data processed per unit of time in video/audio streams. |
| Bucket | – | A cloud storage container (e.g., Amazon S3, Google Cloud Storage) for storing media assets. |
| Captions / Subtitles | – | On-screen text of dialogue and non-verbal cues for accessibility or translation. |
| Codecs | – | Compression algorithms for encoding and decoding video files (e.g., H.264, VP9). |
| Container Format | – | A wrapper for bundling video, audio, and metadata in a single file (e.g., MP4, MKV). |
| Dubbing / Localization | – | Adaptation of content into new languages, including audio replacement and metadata translation. |
| Embedding | – | A numeric vector representing video, audio, or text for ML models. Enables classification and similarity search. |
| Feature Extractor | FE | Client-side module that extracts characteristics from video assets (e.g., thumbnails, credits). Examples: |
| Feature Processor | FP | Component that processes extracted data and produces structured results (e.g., mood tags, scenes). |
| Frame Rate (FPS) | – | Frames per second of a video stream. Common values include 24, 30, and 60. |
| ProRes / DNxHD | – | Professional video codecs used for high-fidelity editing workflows. |
| SDR / HDR | – | Standard and High Dynamic Range. Technologies that determine the brightness and color range of video. |
| Transcoding | – | Conversion of video formats/resolutions to support multi-device delivery. |
| UX (User Experience) | – | The perceived quality of user interaction with a platform, often enhanced via metadata and smart previews. |
| Directed Acyclic Graph | DAG | A directed task graph with no cycles. Defines the execution flow of feature extraction and processing. |
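The DAG entry can be illustrated with Python's standard `graphlib` module, which orders tasks so that every dependency runs before the task that needs it. The dependency map below is hypothetical: it borrows feature names from this glossary, but the actual dependencies between feature codes are not documented here.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each feature code maps to the codes it needs first.
DEPENDENCIES = {
    "emb-inc": set(),
    "emb-vgg": set(),
    "emb-synth": {"emb-inc", "emb-vgg"},
    "mood": {"emb-synth"},
}


def execution_order(dependencies):
    """Return feature codes in an order where every dependency comes first."""
    return list(TopologicalSorter(dependencies).static_order())


def is_acyclic(dependencies):
    """Return True if the dependency map forms a valid DAG (no cycles)."""
    try:
        execution_order(dependencies)
    except CycleError:
        return False
    return True


if __name__ == "__main__":
    print(execution_order(DEPENDENCIES))
    print(is_acyclic({"a": {"b"}, "b": {"a"}}))
```

Rejecting cyclic configurations up front matters because a cycle would leave the scheduler with no task that is ever ready to run.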
Platform & Portal
| Term | Acronym | Definition |
| --- | --- | --- |
| Unified Portal | UP | The main Vionlabs platform interface used by customers for uploads, settings, and job tracking. |
| Just-in-Time | JIT | Processing strategy where outputs (e.g., thumbnails, previews) are generated at request time using live query parameters. Enables flexibility and scale. |
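The Just-in-Time strategy can be sketched as a function that selects its output only when a request arrives, using the query passed in at that moment. Everything in this example is assumed for illustration: the `FRAME_TAGS` data, the `jit_thumbnail` name, and the tag-overlap scoring are not part of any documented Vionlabs API.

```python
# Hypothetical precomputed metadata: timestamp (seconds) -> tags for that frame.
FRAME_TAGS = {
    "asset-1": {
        12.0: {"beach", "sunset"},
        48.5: {"car", "chase", "night"},
        90.0: {"kitchen", "dinner"},
    }
}


def jit_thumbnail(asset_id, query):
    """Pick, at request time, the timestamp whose tags best match the query.

    Ties (including the case where nothing matches) resolve to the
    earliest timestamp.
    """
    terms = set(query.lower().split())
    frames = FRAME_TAGS[asset_id]
    # max() returns the first maximal item, so iterating in sorted order
    # makes the earliest timestamp win on ties.
    return max(sorted(frames), key=lambda ts: len(frames[ts] & terms))


if __name__ == "__main__":
    print(jit_thumbnail("asset-1", "night car chase"))
```

Because nothing is rendered ahead of time, the same asset can yield different thumbnails for different queries, which is the flexibility the definition refers to.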
3rd Party Systems
| Term | Acronym | Definition |
| --- | --- | --- |
| Media Asset Management | MAM | A system for cataloging, storing, and retrieving video and audio libraries in professional workflows. |
| Video Content Management System | VCMS | Manages the full lifecycle of video content including ingest, metadata tagging, transcoding, and publishing. |