Meta AI's V-JEPA - A Video-Based Model and Human-Like Computer Vision

Source: https://www.youtube.com/watch?v=4X_26j5Z43Y

Author: AI Papers Academy

Published: 2024-02-27T00:00:00

Length: 11:35

Content summary

Description

In this video we dive into V-JEPA, a new collection of vision models created by Meta AI. V-JEPA stands for Video Joint-Embedding Predictive Architecture, and is part of Meta AI's implementation of Yann LeCun's vision for a more human-like AI. We take a close look at the research paper that introduced V-JEPA, titled "Revisiting Feature Prediction for Learning Visual Representations from Video". We also recap the key points of I-JEPA, Meta AI's earlier image-based JEPA model, which helps in understanding how JEPA works for videos as well.

We start with a short background on what visual representations are, also known as visual features or semantic embeddings. V-JEPA is trained with unsupervised learning via feature prediction, so we also briefly explain what feature prediction means and how it differs from pixel prediction. With that in place, we cover the JEPA framework, starting with the main idea and continuing with the details for both images (I-JEPA) and videos (V-JEPA).
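
To make the difference between feature prediction and pixel prediction concrete, here is a minimal sketch in plain Python. It is not Meta AI's code: the `encode` function is a hypothetical stand-in for a real encoder (a fixed linear map instead of a Vision Transformer), and the use of a mean absolute (L1) distance is an assumption made for illustration. The only point it shows is where the loss is computed: on raw pixel values, or on the encoder's output for the hidden region.

```python
import random


def encode(patch, weights):
    """Hypothetical stand-in encoder: a fixed linear map from pixels to features."""
    return [sum(w * p for w, p in zip(row, patch)) for row in weights]


def l1_distance(a, b):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pixel_prediction_loss(predicted_pixels, target_pixels):
    """Pixel prediction: compare the prediction with the raw hidden pixels."""
    return l1_distance(predicted_pixels, target_pixels)


def feature_prediction_loss(predicted_features, target_pixels, weights):
    """Feature prediction: compare the prediction with the encoded hidden region."""
    target_features = encode(target_pixels, weights)
    return l1_distance(predicted_features, target_features)


if __name__ == "__main__":
    rng = random.Random(0)
    # A hidden "patch" of 16 pixel values and a 4 x 16 encoder matrix (both made up).
    target = [rng.random() for _ in range(16)]
    weights = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(4)]

    # A predictor would normally produce these; random guesses are used here.
    guessed_pixels = [rng.random() for _ in range(16)]
    guessed_features = [rng.uniform(-1, 1) for _ in range(4)]

    print("pixel loss:  ", pixel_prediction_loss(guessed_pixels, target))
    print("feature loss:", feature_prediction_loss(guessed_features, target, weights))
```

In the pixel case the model has to reproduce all 16 values of the hidden patch, while in the feature case it only has to match the 4-dimensional encoding of that patch, so detail the encoder discards does not have to be predicted.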

Both I-JEPA and V-JEPA are based on Vision Transformers, and this video assumes you are familiar with them. We covered the details of Vision Transformers in the following video - https://youtu.be/NetSJM590Lo

We also have a previous video dedicated solely to I-JEPA, with more details from the I-JEPA paper that we do not cover here - https://youtu.be/6bJIkfi8H-E

-----------------------------------------------------------------------------------------------

Paper page - https://ai.meta.com/research/publications/revisiting-feature-prediction-for-learning-visual-representations-from-video/

Meta AI's V-JEPA blog post - https://ai.meta.com/blog/v-jepa-yann-lecun-ai-model-video-joint-embedding-predictive-architecture/

Code - https://github.com/facebookresearch/jepa

Blog post - https://aipapersacademy.com/v-jepa/

-----------------------------------------------------------------------------------------------


Chapters: