OpenAI Embeddings (and Controversy?!)

#mlnews #openai #embeddings OpenAI launches an embeddings endpoint in their API, providing high-dimensional vector embeddings for use in text similarity, text search, and code search. While embeddings are universally recognized as a standard tool to process natural language, people have raised doubts about the quality of OpenAI’s embeddings, as one blog post found they are often outperformed by open-source models, which are much smaller and with which embedding would cost a fraction of what OpenAI charges. In this video, we examine the claims made and determine what it all means. OUTLINE: 0:00 - Intro 0:30 - Sponsor: Weights & Biases 2:20 - What embeddings are available? 3:55 - OpenAI shows promising results 5:25 - How good are the results really? 6:55 - Criticism: Open models might be cheaper and smaller 10:05 - Discrepancies in the results 11:00 - The author’s response 11:50 - Putting things into perspective 13:35 - What about real world data? 14:40

12 views