Using GPUs in Cloud Run
Sign up for the preview → https://goo.gle/3NnobXv
GPU best practices → https://goo.gle/4elRpBE
Run LLM inference on Cloud Run GPUs with Ollama → https://goo.gle/3BwN6F1
Cloud Run, known for its scalability, now supports GPUs, opening the door to serverless machine learning inference. Join Googlers Martin Omander and Wietse Venema for a practical demonstration of deploying Gemma 2, Google's open large language model, with Ollama on Cloud Run.
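As a rough sketch of the kind of deployment the demo walks through (the service name, region, image, and resource sizes below are assumptions, not taken from the video; GPU support may require the beta gcloud component and GPU quota in your project):

```shell
# Deploy an Ollama container to Cloud Run with one NVIDIA L4 GPU attached.
# Service name "ollama-gemma", region, and sizing are illustrative only.
gcloud beta run deploy ollama-gemma \
  --image ollama/ollama \
  --region us-central1 \
  --gpu 1 \
  --gpu-type nvidia-l4 \
  --no-cpu-throttling \
  --cpu 4 --memory 16Gi \
  --no-allow-unauthenticated

# In a second terminal, open an authenticated local proxy to the service...
gcloud run services proxy ollama-gemma --region us-central1 --port 11434

# ...then test the Ollama API through the proxy:
curl http://localhost:11434/api/generate \
  -d '{"model": "gemma2", "prompt": "Why is the sky blue?"}'
```

Keeping the service behind `--no-allow-unauthenticated` and testing via the proxy avoids exposing the inference endpoint publicly while you experiment.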
Chapters:
0:00 - Intro
0:22 - Google Vertex AI vs GPUs with Cloud Run
1:12 - AI app architecture
2:04 - [Demo] Deploying Ollama API
3:26 - [Demo] Testing the deployment
5:28 - [Demo] Build & deploy the front end
6:02 - How do GPUs scale on Cloud Run?
6:34 - Where are Gemma 2 model files stored?
7:12 - Getting started with GPUs in Cloud Run
More Resources:
Cloud Run pricing → https://goo.gle/3BeMhAD
Watch more Serverless Expeditions → https://goo.gle/ServerlessExpeditions
Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech
#ServerlessExpeditions #GoogleCloud
Speakers: Martin Omander, Wietse Venema
Products Mentioned: Cloud Run, Gemma