Google's Mixture of Nested Experts: An Efficient Alternative to MoE?
In this video, we dive into a recent research paper by Google, titled "Mixture of Nested Experts: Adaptive Processing of Visual Tokens". While standard Mixture of Experts (MoE) is successfully applied in LLMs, and also in computer vision, to increase model capacity without a proportional increase in computational cost, it comes with a large memory footprint. The Mixture of Nested Experts (MoNE) model we review in this video tackles that drawback. MoNE is built on top of the Vision Transformer (ViT) architecture and offers a dramatic efficiency improvement by exploiting the fact that images naturally contain a large amount of redundant information. So, while ViT (also with MoE) allocates its full compute to every token, MoNE learns to allocate compute to each token based on its importance.
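To make the nested-experts idea concrete, here is a minimal sketch in PyTorch. It is not the paper's implementation: it assumes a simplified setup where all "experts" share one MLP's weights and expert k just uses the first nested_dims[k] hidden units, with a hypothetical router that greedily picks one nested expert per token. The class name NestedExpertMLP and the fixed width choices are illustrative assumptions.

```python
# Illustrative sketch only: nested experts share weights, smaller experts use
# a prefix of the hidden units, and a router sends each token to one expert.
import torch
import torch.nn as nn


class NestedExpertMLP(nn.Module):
    def __init__(self, d_model=256, d_hidden=1024, nested_dims=(128, 256, 512, 1024)):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_hidden)   # shared weights for all nested experts
        self.fc2 = nn.Linear(d_hidden, d_model)
        self.router = nn.Linear(d_model, len(nested_dims))  # per-token importance scores
        self.nested_dims = nested_dims

    def forward(self, x):                           # x: (batch, tokens, d_model)
        expert_idx = self.router(x).argmax(dim=-1)  # pick one nested expert per token
        out = torch.zeros_like(x)
        for k, dim in enumerate(self.nested_dims):
            mask = expert_idx == k
            if mask.any():
                # expert k only uses the first `dim` hidden units -> less compute
                h = torch.relu(x[mask] @ self.fc1.weight[:dim].T + self.fc1.bias[:dim])
                out[mask] = h @ self.fc2.weight[:, :dim].T + self.fc2.bias
        return out


tokens = torch.randn(2, 196, 256)        # e.g. 14x14 ViT patch tokens
print(NestedExpertMLP()(tokens).shape)   # torch.Size([2, 196, 256])
```

In the paper the routing is done under an explicit compute budget (so only a fraction of tokens reach the largest nested expert); the greedy argmax here is only a stand-in for that mechanism.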
Watch the video to learn more.
Paper page - https://arxiv.org/abs/2407.19985
Mixture of Experts (MoE) Video - https://youtu.be/kb6eH0zCnl8
Post - https://aipapersacademy.com/mixture-of-nested-experts/
Original Mixture-of-Experts paper review - https://aipapersacademy.com/mixture-of-experts/
-----------------------------------------------------------------------------------------------
✉️ Join the newsletter - https://aipapersacademy.com/newsletter/
👍 Please like & subscribe if you enjoy this content
-----------------------------------------------------------------------------------------------
Chapters:
0:00 Introduction
1:20 MoNE Illustration
4:36 MoNE Diagram
5:47 Results