Can we learn joint representations between vision and language?
We can!
This video explains VideoBERT, an algorithm that exploit the ideas of BERT to learn such deep representations.
The learned vision-language model can be used to solve many downstream tasks.
Enjoy the video!
Papers cited in the video:
- VideoBERT: A Joint Model for Video and Language Representation Learning
#VideoBERT #languagemodels #BERT #neuralnetworks #deeplearning #nlp #computervision
37 views
25
4
3 weeks ago 00:02:11 1
NELSON EDDY SINGS MY NAME IS JOHN WELLINGTON WELLS GILBERT & SULLIVAN 1942
3 months ago 00:08:50 4
Afro cuban jazz suite - Chico O´Farrill
3 months ago 00:01:53 3
John Carpenter’s Toxic Commando - Reveal Trailer | Summer Game Fest 2023
3 months ago 00:11:21 1
Saltatio Mortis - Darkenguard feat. Blind Guardian (Official Video)
3 months ago 00:13:08 1
⚠️15 Most Dangerous Motorcycles Only for Experienced Riders❗❗ 2024/25
3 months ago 00:02:38 4
Euclid’s 208-Gigapixel glimpse into the Universe
3 months ago 00:13:57 1
Берт Хеллингер “О некоторых помогающих профессиях“/ Bert Hellinger #расстановки
3 months ago 00:05:33 1
Magic Pot | Bedtime Stories for Kids in English | Fairy Tales
3 months ago 00:09:05 1
Melissa Gilbert has a new Fabric line
3 months ago 01:55:08 1
Liquor mixed by Nuvertal
3 months ago 00:12:04 1
2025 New Restored 40-Year-Old Classic Motorcycles
3 months ago 00:04:50 1
. | Stephanie Trick & Paolo Alderighi
3 months ago 00:55:57 1
Dr. Gilbert Doctorow : BRICS and the Conflicts in Ukraine and Middle East