Glossary

Updated on December 18th, 2025

Table of Contents

Product Features
Technical Terms
Platform & Portal
3rd Party Systems

This glossary provides a unified reference for the core terminology used across the Vionlabs platform, APIs, and documentation.
It defines how we describe entities, processes, and concepts within the product, ensuring consistent understanding across engineering, product, and customer teams.

The terms are organized by domain (Business Domain, Product Features, Technical Terms, Platform & Portal, and 3rd Party Systems) and cover both operational definitions and technical abbreviations.

Use this glossary to:

Clarify how platform entities like Catalog, Asset, Product, and FeatureCode relate to one another.
Understand the scope of processing terms used throughout the API and data model.
Reference acronyms and domain-specific terms when integrating or building on top of the Vionlabs platform.

This page is continuously updated as the platform evolves to include new capabilities, features, and technical definitions.

Business Domain

Term	Acronym	Definition
Account	–	A logical subdivision within an organization. Used to structure access, e.g., for subsidiaries, brands, or sandbox use cases.
Asset	–	A piece of digital content (video, audio, image) managed within a catalog or content library.
Catalog	–	A customer's media library. Contains metadata such as name and description and is the root entity for all processing. Linked 1:1 to an `Operation` record. Types: edge catalog, user upload catalog, demo onboarding catalog.
Operation	–	Metadata configuration for a catalog. Defines processing flags like `remote_extracting`, `user_upload`, `demo_onboarding`, `gfp_processing`, and `download_videos`.
ApiKey, ApiKeyPermission	–	Stores customer-facing API keys and their permissions. Keys are sensitive information; encrypting a key before handing it over to the customer is preferred. Keys are linked to a catalog and grant access to result APIs.
CatalogApiConfig	–	Per-catalog configuration for result APIs. Includes result versions, flags (`series_aggregation_enabled`, etc.), and webhook settings. Linked 1:1 with Catalog.
CatalogFeatureConfig	–	Manages active feature codes per catalog. Includes `priority` for processing order and `asset_filter` for filtering applicable asset types.
FeatureCode	FC	Internal identifier for a unit of work in the pipeline. Defines Docker image, feature type, dependencies, and scheduling metadata.
Inventory	–	Media items (movies, episodes, series) within a catalog. Each entry links to related processing jobs and status.
AssetFeatureStatus	–	Tracks feature processing status per asset. Serves as the backend unit behind the Job entity.
Job	–	Logical grouping of `AssetFeatureStatus` entries. Used in the admin UI to monitor processing workflows.
AvailableTask	–	Defines which product/feature versions are available for task creation via the Task API. Includes priority and activation status.
Product	–	A view combining `CatalogFeatureConfig` and `CatalogApiConfig`. Considered enabled when both config and feature code match.
UserCatalog	–	Many-to-many mapping between users and catalogs. Includes default catalog flag for UI display.
Operation History	–	Logs admin changes and configuration updates made in the Django Admin portal.
Reports	–	Analytics and billing data. Reports include ingestion volume, feature usage, processing time, and order-based billing events.
Customer / Organization	–	The end-user or company consuming Vionlabs' technology for AI-driven content operations.
Episode	–	A single unit in a serialized show.
Genre	–	A content category (e.g., Drama, Comedy) used for classification and recommendation.
IMDb / TMDB	–	Widely used databases offering movie metadata, often used for enrichment or cross-reference.
Character Tracking	–	AI-based method for identifying and following characters through a video.
Intro / Recap / Credit Detection	–	AI-powered tagging of intros, recaps, and credits to enable skipping or repurposing scenes.
Keywords	–	Descriptive tags derived from video/audio/text to improve discoverability and search relevance.
Movie	–	A full-length video asset analyzed with scene-level insights.
Mood	–	The emotional tone of a scene (e.g., sad, tense, joyful), generated via multimodal analysis.
Mood Tags	–	Metadata labels that describe the emotional content of a scene or title.
Series	–	A multi-episode content format, each installment processed individually or as part of a set.
Subtitles	–	Text version of spoken dialogue, usually in the original or a translated language, displayed on-screen.
Synopsis	–	A short AI-generated summary of a video’s story, used for previews and catalogs.
Thumbnail	–	A representative image, typically AI-selected, used as the visual cue for content in a catalog or interface.
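The relationships described in the table above can be sketched as plain Python dataclasses. This is an illustrative model only, not the platform's actual schema: the field defaults, the "lower priority runs first" ordering, the meaning of an empty `asset_filter`, and the `product_enabled` rule (one possible reading of "both config and feature code match") are all assumptions, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Operation:
    # Processing flags named in the glossary; the default values are assumptions.
    remote_extracting: bool = False
    user_upload: bool = False
    demo_onboarding: bool = False
    gfp_processing: bool = False
    download_videos: bool = False


@dataclass
class CatalogFeatureConfig:
    feature_code: str                   # hypothetical value such as "intro"
    priority: int = 0                   # assumption: lower values are processed first
    asset_filter: Optional[str] = None  # assumption: None means "all asset types"


@dataclass
class CatalogApiConfig:
    # Assumption: result versions are keyed by feature code.
    result_versions: Dict[str, str] = field(default_factory=dict)
    series_aggregation_enabled: bool = False


@dataclass
class Catalog:
    name: str
    description: str
    operation: Operation          # linked 1:1
    api_config: CatalogApiConfig  # linked 1:1
    feature_configs: List[CatalogFeatureConfig] = field(default_factory=list)

    def processing_order(self) -> List[str]:
        """Feature codes sorted by priority."""
        ordered = sorted(self.feature_configs, key=lambda c: c.priority)
        return [c.feature_code for c in ordered]

    def applicable_features(self, asset_type: str) -> List[str]:
        """Feature codes whose asset_filter admits the given asset type."""
        ordered = sorted(self.feature_configs, key=lambda c: c.priority)
        return [c.feature_code for c in ordered
                if c.asset_filter is None or c.asset_filter == asset_type]

    def product_enabled(self, feature_code: str) -> bool:
        """Hypothetical rule: enabled when both configs reference the feature code."""
        has_feature = any(c.feature_code == feature_code for c in self.feature_configs)
        return has_feature and feature_code in self.api_config.result_versions


catalog = Catalog(
    name="demo",
    description="Demo onboarding catalog",
    operation=Operation(demo_onboarding=True),
    api_config=CatalogApiConfig(result_versions={"intro": "v1"}),
    feature_configs=[
        CatalogFeatureConfig("mood", priority=2),
        CatalogFeatureConfig("intro", priority=1, asset_filter="episode"),
    ],
)
```

With this sketch, `catalog.processing_order()` lists `intro` before `mood`, and `catalog.product_enabled("mood")` is false because no result version is configured for it.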

Product Features

 
Term	Acronym	Definition
Adbreaks	–	Given time series of audio (VGGish) and video (Inception) features, predicts the seconds at which ad breaks should occur.
Adbreaks-Multi	–	Collects the different components for scene-boundary detection and aligns them with speech to produce ad breaks. No inference is performed.
Binge Markers	BM	AI-detected timestamps like intros, recaps, and credits that enable seamless binge-watching.
Contextual AdBreak	–	Ad placement at relevant moments in video content, based on mood and narrative context.
Fingerprint+	FP+	Enhanced fingerprinting using multimodal embeddings and context-aware metadata.
Jit Preview	JIT (JP)	Dynamically generated preview clips based on user query context. Generated only when requested.
Jit Thumbnail	JIT (JT)	Thumbnails rendered in real-time based on the most relevant moment at request time.
Metadata	–	Descriptive video data (e.g., mood, actors, keywords) used for search, sort, and recommendation.
Montage / Trailer	–	A sequence of selected video clips edited into a promotional or summary format.
Preview Clip	–	Short, auto-generated segment of a video optimized to encourage viewing.
Scene Search	–	Search functionality that enables users to find scenes based on content characteristics.
Character	–	Character tracking combining clustering, facial recognition, and classical object tracking algorithms.
Intro	–	Detects start and end times of the intro in a video based on multimodal features.
Recap	–	Detects start and end times of the recap in a video.
Credit	–	Detects start and end times of the credit section in a video.
Keyword	–	Predicts keywords and mood tags based on multimodal embeddings.
Mood	–	Predicts mood metadata using synth embeddings from multiple modalities.
Synopsis (LLM)	–	Generates a movie synopsis using a transcript and an LLM.
NSFW / Nudity Detection	–	Classifies whether video frames contain nudity. NSFW is an alias.
Profanity	–	Predicts whether text contains offensive language. Based on an off-the-shelf model.
Language Detection	–	Detects primary spoken language based on transcripts.
Speech Translation	–	Performs speech transcription and translation into English.
Anime Character Recognition	–	Runs feature extraction for detected anime faces.
Anime Character Detection XS	–	Facial detection for anime/cartoon content.
Anime Character Portraits	–	Extracts facial portraits of recognized anime characters.
Backdropness	–	Predicts if an image is suitable for use as a thumbnail.
Emb-Inc / Emb-VGG / Emb-Synth	–	Extracts and synthesizes visual/audio embeddings for fingerprinting.
Emb-Synth-NPY	–	Converts synth embeddings to NumPy arrays for downstream use.
Keyword-Joiner / Keyword-Mood / Keyword-NoMood	–	Filter or combine keyword/mood outputs.
Speech-Lite	–	Voice Activity Detection for determining speech presence per second.
Shot-Sim	–	Scene boundary detection based on color histogram deltas.
Scenes	–	Extracts image statistics for determining shot boundaries.
Scenes-Boundary	–	Predicts scene boundaries in videos.
Seg-Delta & VAD Models	–	Segment-based models for emotion, keyword, and text embedding extraction.
Text Detection / Selection / Keywords	–	NLP models using LLaMA or Sent2Vec to identify relevant keywords or classify text frames.
Transcript Segmentation	–	Breaks long transcripts into semantically meaningful segments.
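To make the "color histogram deltas" idea behind Shot-Sim concrete, here is a minimal sketch of the general technique. It is not the Vionlabs implementation: it uses grayscale intensities instead of color channels to stay short, and the function names, bin count, and threshold are assumptions chosen for illustration.

```python
from typing import List, Sequence


def histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Normalized intensity histogram of one frame (pixel values in 0..255)."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = len(frame) or 1  # avoid division by zero on an empty frame
    return [c / total for c in counts]


def histogram_deltas(frames: Sequence[Sequence[int]], bins: int = 8) -> List[float]:
    """L1 distance between the histograms of each pair of consecutive frames."""
    hists = [histogram(f, bins) for f in frames]
    return [sum(abs(a - b) for a, b in zip(h1, h2))
            for h1, h2 in zip(hists, hists[1:])]


def boundaries(frames: Sequence[Sequence[int]],
               threshold: float = 0.5, bins: int = 8) -> List[int]:
    """Indices of frames that start a new shot (delta to the previous frame > threshold)."""
    deltas = histogram_deltas(frames, bins)
    return [i + 1 for i, d in enumerate(deltas) if d > threshold]


# Toy input: two dark frames, two bright frames, one dark frame.
dark, bright = [10] * 16, [240] * 16
frames = [dark, dark, bright, bright, dark]
print(boundaries(frames))  # prints [2, 4]
```

The L1 distance between two normalized histograms ranges from 0 (identical distributions) to 2 (no overlap), so the threshold is a value in that range.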

Technical Terms

 
Term	Acronym	Definition
Amazon Machine Image	AMI	A virtual machine image built per customer that includes packaged Feature Extractors (FEs). Used in edge or isolated environments.
AWS / GCP	–	Amazon Web Services / Google Cloud Platform. Cloud infrastructure for compute, storage, and AI pipelines.
Bitrate	–	The amount of data processed per unit of time in video/audio streams.
Bucket	–	A cloud storage container (e.g., Amazon S3, Google Cloud Storage) for storing media assets.
Captions / Subtitles	–	On-screen text of dialogue and non-verbal cues for accessibility or translation.
Codecs	–	Compression algorithms for encoding and decoding video files (e.g., H.264, VP9).
Container Format	–	A wrapper for bundling video, audio, and metadata in a single file (e.g., MP4, MKV).
Dubbing / Localization	–	Adaptation of content into new languages, including audio replacement and metadata translation.
Embedding	–	A numeric vector representing video, audio, or text for ML models. Enables classification and similarity search.
Feature Extractor	FE	Client-side module that extracts characteristics from video assets (e.g., thumbnails, credits). Examples: speech-v3, shot-sim-v3.
Feature Processor	FP	Component that processes extracted data and produces structured results (e.g., mood tags, scenes).
Frame Rate (FPS)	–	Frames per second of a video stream. Common values include 24, 30, and 60.
ProRes / DNxHD	–	Professional video codecs used for high-fidelity editing workflows.
SDR / HDR	–	Standard vs. High Dynamic Range technologies impacting video brightness and color.
Transcoding	–	Conversion of video formats/resolutions to support multi-device delivery.
UX (User Experience)	–	The perceived quality of user interaction with a platform, often enhanced via metadata and smart previews.
Directed Acyclic Graph	DAG	A task graph with no cycles. Defines execution flow of feature extraction and processing.
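As an illustration of how a DAG determines execution flow, the sketch below orders a set of feature codes so that every dependency runs before the steps that need it. It uses Python's standard `graphlib` module. The feature code names come from this glossary, but the dependency edges between them are assumptions made for the example, not the platform's actual pipeline definition.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set

# Hypothetical dependency map: each key depends on the feature codes in its set.
dependencies: Dict[str, Set[str]] = {
    "emb-inc": set(),
    "emb-vgg": set(),
    "emb-synth": {"emb-inc", "emb-vgg"},
    "mood": {"emb-synth"},
    "keyword": {"emb-synth"},
}


def execution_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return one valid order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps: Dict[str, Set[str]]) -> bool:
    """True if the graph contains a loop and therefore is not a DAG."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError:
        return True
    return False


order = execution_order(dependencies)
print(order)                                   # embeddings first, then mood/keyword
print(has_cycle({"a": {"b"}, "b": {"a"}}))     # prints True
```

Several orders can be valid for the same graph (for example, `emb-inc` and `emb-vgg` are interchangeable), which is why independent steps in a DAG can also be scheduled in parallel.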

Platform & Portal

 
Term	Acronym	Definition
Unified Portal	UP	The main Vionlabs platform interface used by customers for uploads, settings, and job tracking.
Just-in-Time	JIT	Processing strategy where outputs (e.g., thumbnails, previews) are generated at request time using live query parameters. Enables flexibility and scale.
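The Just-in-Time idea can be illustrated with a small sketch in which nothing is chosen in advance: the thumbnail moment is picked only when a request arrives, using the query parameters of that request. Everything here is hypothetical, including the `Moment` type, the tag-overlap scoring, and the tie-breaking rule; it shows the request-time pattern, not how the platform actually ranks moments.

```python
from dataclasses import dataclass
from typing import AbstractSet, FrozenSet, Sequence


@dataclass(frozen=True)
class Moment:
    second: int            # position in the video
    tags: FrozenSet[str]   # hypothetical descriptive tags for that moment


def jit_thumbnail(moments: Sequence[Moment], query_tags: AbstractSet[str]) -> Moment:
    """Pick, at request time, the moment whose tags overlap most with the query.

    Ties are broken in favor of the earliest moment (an arbitrary choice).
    """
    if not moments:
        raise ValueError("no candidate moments available")
    return max(moments, key=lambda m: (len(m.tags & query_tags), -m.second))


moments = [
    Moment(10, frozenset({"tense"})),
    Moment(95, frozenset({"joyful", "beach"})),
    Moment(40, frozenset({"joyful"})),
]

# Two different requests against the same video yield different thumbnails.
print(jit_thumbnail(moments, {"joyful"}).second)           # prints 40
print(jit_thumbnail(moments, {"joyful", "beach"}).second)  # prints 95
```

Because the selection depends on the query, the same asset can serve a different thumbnail to each request without any output being rendered ahead of time.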

3rd Party Systems

 
Term	Acronym	Definition
Media Asset Management	MAM	A system for cataloging, storing, and retrieving video and audio libraries in professional workflows.
Video Content Management System	VCMS	Manages the full lifecycle of video content including ingest, metadata tagging, transcoding, and publishing.

