Multimodal embeddings at scale: AI data lake for media and entertainment workloads
This post shows you how to build a scalable multimodal video search system that supports natural language search across large video datasets using Amazon Nova models and Amazon OpenSearch Service. You will learn how to move beyond manual tagging and keyword-based search to semantic search that captures the full richness of video content. We demonstrate this at scale by processing 792,270 videos from two AWS Open Data Registry datasets: Multimedia Commons (787,479 videos, 37-second average) and MEVA […]