Google's Mixture of Nested Experts: An Efficient Alternative to MoE?

Author: AI Papers Academy
Published: 2024-08-11
Length: 07:37

In this video, we dive into a recent research paper by Google, titled "Mixture of Nested Experts: Adaptive Processing of Visual Tokens". Standard Mixture of Experts (MoE) is successfully applied in LLMs, and also in computer vision, to increase model size without a proportional increase in computational cost, but it comes with a large memory footprint. The Mixture of Nested Experts (MoNE) reviewed in this video tackles that drawback. MoNE is built on top of the Vision Transformer (ViT) architecture and offers a dramatic improvement in inference efficiency by leveraging the fact that images naturally contain a large amount of redundant information. So, while a ViT (even with MoE) allocates its full compute to every token, MoNE learns to allocate compute to tokens based on their importance.
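To make the idea concrete, here is a minimal sketch (not the paper's implementation) of how nested experts can be wired up: a router scores each visual token and sends it to one of several experts, where expert k only uses a prefix of the shared projection weights, so less important tokens cost less compute. All class names, layer shapes, and the argmax routing below are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch of the nested-experts idea (assumed names and sizes).
import torch
import torch.nn as nn


class NestedExpertsLayer(nn.Module):
    def __init__(self, dim=256, nested_dims=(64, 128, 256)):
        super().__init__()
        self.nested_dims = nested_dims                  # nested expert widths, small to full
        self.router = nn.Linear(dim, len(nested_dims))  # scores each token per expert
        self.up = nn.Linear(dim, dim)                   # shared weights; experts use prefixes
        self.down = nn.Linear(dim, dim)

    def forward(self, tokens):                          # tokens: (num_tokens, dim)
        # Route each token to one nested expert based on its router score.
        expert_idx = self.router(tokens).argmax(dim=-1)
        out = torch.zeros_like(tokens)
        for k, d_k in enumerate(self.nested_dims):
            mask = expert_idx == k
            if not mask.any():
                continue
            x = tokens[mask]
            # Expert k uses only the first d_k channels of the shared projections,
            # so tokens routed to smaller experts need fewer FLOPs.
            h = torch.relu(x @ self.up.weight[:d_k].T + self.up.bias[:d_k])
            out[mask] = h @ self.down.weight[:, :d_k].T + self.down.bias
        return out


layer = NestedExpertsLayer()
print(layer(torch.randn(10, 256)).shape)  # torch.Size([10, 256])
```

The key contrast with standard MoE is that the experts here are not separate weight matrices but nested slices of one shared matrix, which is why the memory footprint stays close to that of a plain ViT.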

Watch the video to learn more.

Paper page - https://arxiv.org/abs/2407.19985

Mixture of Experts (MoE) Video - https://youtu.be/kb6eH0zCnl8

Post - https://aipapersacademy.com/mixture-of-nested-experts/

Original Mixture-of-Experts paper review - https://aipapersacademy.com/mixture-of-experts/

-----------------------------------------------------------------------------------------------

✉️ Join the newsletter - https://aipapersacademy.com/newsletter/

👍 Please like & subscribe if you enjoy this content

-----------------------------------------------------------------------------------------------

Chapters:

0:00 Introduction

1:20 MoNE Illustration

4:36 MoNE Diagram

5:47 Results
