Beyond Keywords: Image similarity search in Azure Cosmos DB for PostgreSQL | Python Data Science Day

Vector search, also known as vector similarity search, finds similar items based on their content rather than on exact matches against properties such as keywords, tags, or other metadata, as keyword-based search systems do. It uses machine learning to capture the meaning of data. The key idea is to translate unstructured data, such as text, images, videos, and audio, into high-dimensional vectors (also known as embeddings) and then apply nearest neighbor algorithms to find similar data.

In this quickstart session, we will build an image similarity search system using Python, Azure Cosmos DB for PostgreSQL, and pgvector, an open-source vector similarity search extension for PostgreSQL. We will generate vector embeddings with the Azure AI Vision multimodal embeddings API and enable the pgvector extension. We will then discuss exact and approximate nearest neighbor search and use Azure Cosmos DB for PostgreSQL to store and query vector data.

Chapters:
00:00 Image similarity search in Azure Cosmos DB for PostgreSQL
00:56 Why vector search?
01:49 Agenda
02:14 Turn data into vectors
03:02 Project the vectors onto the 2D vector space
03:37 How to measure if 2 vectors are similar
03:56 Vector search workflow
04:34 Vector search in PostgreSQL
05:01 Create a table to store embeddings
05:34 Query embeddings
06:01 Demo
07:00 Vector search strategies
08:21 Create an IVFFlat index in pgvector
09:30 Demo
10:01 Resources

Resources:
Project - GitHub Repository
pgvector - GitHub Repository
Vectors on Azure Cosmos DB for PostgreSQL
Azure AI Vision multimodal embeddings APIs
Survey
Python at Microsoft
Cloud Skills Challenge - through April 15, 2024
GitHub Codespaces
VS Code Release notes

Featuring: Foteini Savvidou, Software Engineer, Microsoft AI MVP (@SavvidouFoteini)
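The workflow covered in the session (store embeddings, query them by distance, then add an IVFFlat index) can be sketched roughly as follows. This is a minimal illustration, not the code from the session: the table name `images`, its columns, the toy three-dimensional vectors, and the `lists = 100` setting are all assumptions made for the example, and the vector size of 1024 assumes the Azure AI Vision multimodal embeddings model. The SQL strings use pgvector syntax but are not executed here; the pure-Python functions only mirror what an exact nearest neighbor search with cosine distance computes.

```python
import math

# --- pgvector statements (hypothetical table and column names) ---
# Enable the extension once per database.
CREATE_EXTENSION = "CREATE EXTENSION IF NOT EXISTS vector;"

# Assumption: 1024-dimensional embeddings, one row per image.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS images (
    id bigserial PRIMARY KEY,
    file_name text NOT NULL,
    embedding vector(1024)
);
"""

# Exact search: <=> is pgvector's cosine distance operator.
QUERY_NEAREST = """
SELECT file_name
FROM images
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""

# Approximate search: an IVFFlat index for cosine distance.
# The number of lists is an illustrative value, not a recommendation.
CREATE_IVFFLAT_INDEX = """
CREATE INDEX ON images
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
"""


# --- What exact search computes, in plain Python ---
def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_nearest(query, items, k=3):
    """Rank every stored vector by cosine distance to the query."""
    ranked = sorted(items, key=lambda name: cosine_distance(query, items[name]))
    return ranked[:k]


# Toy 3-dimensional "embeddings" standing in for real image vectors.
toy_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "kitten.jpg": [0.8, 0.2, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    print(exact_nearest([1.0, 0.0, 0.0], toy_embeddings, k=2))
```

In a real setup, the SQL strings would be passed to a PostgreSQL driver, with the query embedding supplied as the first parameter of `QUERY_NEAREST`. Exact search compares the query against every row, while the IVFFlat index trades some recall for speed by searching only a subset of the lists.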