What Are Word Embeddings?

Author: Under The Hood

Published: 2025-02-22T00:00:00
Length: 19:32

#word2vec #llm

Converting text into numbers is the first step in training any machine learning model for NLP tasks. While one-hot encoding and bag of words offer simple ways to represent text as numbers, they capture no semantic or contextual understanding of words, which makes them unsuitable for NLP tasks such as language translation and text generation. Embeddings instead represent words as vectors that capture their semantic meaning.
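To make that contrast concrete, here is a minimal NumPy sketch (the toy vocabulary and made-up embedding values are not taken from the video): one-hot vectors are mutually orthogonal, so no word is any "closer" to another, while dense embedding vectors can encode similarity through cosine similarity.

```python
import numpy as np

# Toy vocabulary; the indices stand in for a real tokenizer's output.
vocab = {"king": 0, "queen": 1, "pizza": 2}

# One-hot encoding: each word is a sparse vector with a single 1.
one_hot = np.eye(len(vocab))
print(one_hot[vocab["king"]])                             # [1. 0. 0.]

# Every pair of distinct one-hot vectors is orthogonal, so the representation
# carries no notion of "king is closer to queen than to pizza".
print(one_hot[vocab["king"]] @ one_hot[vocab["queen"]])   # 0.0
print(one_hot[vocab["king"]] @ one_hot[vocab["pizza"]])   # 0.0

# Hypothetical 4-dimensional embeddings (values invented for illustration):
# similar words get nearby vectors, so cosine similarity becomes meaningful.
emb = {
    "king":  np.array([0.8, 0.6, 0.1, 0.2]),
    "queen": np.array([0.7, 0.7, 0.2, 0.2]),
    "pizza": np.array([0.0, 0.1, 0.9, 0.8]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(emb["king"], emb["queen"]))  # high -> semantically close
print(cosine(emb["king"], emb["pizza"]))  # low  -> semantically distant
```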

In this video, I provide a detailed explanation of embeddings and popular embedding techniques like Word2Vec, along with custom embeddings used in Transformer architectures for language generation and other NLP tasks.
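The description does not say which tooling the video uses, so as an assumption, here is a minimal Word2Vec training sketch with the gensim library; the toy corpus and hyperparameters (vector_size, window, epochs) are purely illustrative.

```python
from gensim.models import Word2Vec

# A tiny, made-up corpus of pre-tokenized sentences; a real run would use a
# much larger corpus produced by the tokenization step from the previous video.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "chef", "bakes", "a", "pizza"],
]

# sg=0 trains CBOW (predict the center word from its context);
# sg=1 trains Skip-Gram (predict the context from the center word).
model = Word2Vec(
    sentences=sentences,
    vector_size=50,   # dimensionality of the learned word vectors
    window=2,         # context window size on each side of the center word
    min_count=1,      # keep even rare words in this toy corpus
    sg=1,             # Skip-Gram; set sg=0 for CBOW
    epochs=100,
)

print(model.wv["king"].shape)          # (50,)
print(model.wv.most_similar("king"))   # nearest neighbours in embedding space
```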

Watch the videos in the Understanding Large Language Model series:

1) Introduction to Large Language Model

https://youtu.be/NLOBYtfdxuM?si=PyVqqNLFsbRPvBI6

2) Preparing dataset and Tokenization

https://youtu.be/bNjVxUDZQfM?si=81fHwXskdl9abdZy

Timestamps:

0:00 - Intro

0:20 - Representing images as numbers

0:54 - Representing text as numbers

2:20 - One Hot Encoding

3:40 - Bag of Words (Unigram, Bigram and N-Gram)

4:59 - Semantic and Contextual Understanding of text

6:28 - Word Embeddings

9:44 - Visualizing Word2Vec Embeddings

10:30 - Word2Vec Training (CBOW and Skip-Gram)

14:46 - Embedding Layer in Transformer Architecture (see the code sketch after the timestamps)

17:16 - Positional Encoding

18:46 - Outro
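For the 14:46 and 17:16 segments, here is a minimal PyTorch sketch of a learned token-embedding layer combined with the sinusoidal positional encoding from the original Transformer paper; the vocabulary size, model dimension, and token ids are illustrative assumptions, and the video may implement these steps differently.

```python
import math
import torch
import torch.nn as nn

vocab_size, d_model, max_len = 10_000, 512, 128   # illustrative sizes

# Learned token embedding layer: maps each token id to a d_model-dim vector.
token_embedding = nn.Embedding(vocab_size, d_model)

# Sinusoidal positional encoding ("Attention Is All You Need"):
# PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
# PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
pos = torch.arange(max_len).unsqueeze(1).float()
div = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10_000.0) / d_model))
pe = torch.zeros(max_len, d_model)
pe[:, 0::2] = torch.sin(pos * div)
pe[:, 1::2] = torch.cos(pos * div)

# Example: a batch containing one sequence of six made-up token ids.
token_ids = torch.tensor([[4, 17, 256, 3, 99, 1]])
x = token_embedding(token_ids)       # (1, 6, 512) token embeddings
x = x + pe[: token_ids.size(1)]      # add position information to each token
print(x.shape)                       # torch.Size([1, 6, 512])
```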

Word2Vec paper (Efficient Estimation of Word Representations in Vector Space):

https://arxiv.org/abs/1301.3781

Visualize Word2Vec Embeddings Here:

https://projector.tensorflow.org/
