Using GPUs in Cloud Run

Tác giả: Google Cloud Tech
Ngày xuất bản: 2024-10-03T00:00:00
Length: 08:44

Sign up for the preview → https://goo.gle/3NnobXv

GPU best practices → https://goo.gle/4elRpBE

Run LLM inference on Cloud Run GPUs with Ollama → https://goo.gle/3BwN6F1

Cloud Run, known for its scalability, now supports GPUs, opening it up to machine learning inference workloads. Join Googlers Martin Omander and Wietse Venema as they give a practical demonstration of deploying Google's Gemma 2, an open large language model, with Ollama on Cloud Run.
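The deployment shown in the video can be sketched as a single `gcloud` command. This is a hedged sketch, not the exact command from the demo: the service name and region are hypothetical, and the GPU flags (`--gpu`, `--gpu-type`) belong to the Cloud Run GPU preview, so check the current documentation before running it.

```shell
# Sketch of deploying an Ollama container to Cloud Run with one NVIDIA L4 GPU.
# Service name "ollama-gemma" and region are placeholders, not from the video.
gcloud beta run deploy ollama-gemma \
  --image=ollama/ollama \
  --region=us-central1 \
  --gpu=1 \
  --gpu-type=nvidia-l4 \
  --cpu=4 \
  --memory=16Gi \
  --no-cpu-throttling \
  --max-instances=1
```

The larger CPU and memory settings reflect that GPU-attached Cloud Run services require more resources than a typical serverless container; `--max-instances` caps scaling while experimenting.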

Chapters:

0:00 - Intro

0:22 - Google Vertex AI vs GPUs with Cloud Run

1:12 - AI app architecture

2:04 - [Demo] Deploying Ollama API

3:26 - [Demo] Testing the deployment

5:28 - [Demo] Build & deploy the front end

6:02 - How do GPUs scale on Cloud Run?

6:34 - Where are Gemma 2 model files stored?

7:12 - Getting started with GPUs in Cloud Run
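For the "Testing the deployment" step, a deployed Ollama service can be exercised over its REST API. The service URL below is a hypothetical placeholder; the `/api/generate` endpoint and its `model`/`prompt`/`stream` fields are Ollama's standard generate API, and the identity token is the usual way to call an authenticated Cloud Run service.

```shell
# Hypothetical Cloud Run URL -- replace with your service's URL.
# Sends one non-streaming prompt to the Gemma 2 model served by Ollama.
curl https://ollama-gemma-example-uc.a.run.app/api/generate \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -d '{"model": "gemma2", "prompt": "Why is the sky blue?", "stream": false}'
```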

More Resources:

Cloud Run pricing → https://goo.gle/3BeMhAD

Watch more Serverless Expeditions → https://goo.gle/ServerlessExpeditions

Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech

#ServerlessExpeditions #GoogleCloud

Speakers: Martin Omander, Wietse Venema

Products Mentioned: Cloud Run, Gemma
