Building a Real-time Voice RAG Agent with Gemma 3 & Voice Cloning
I just created a Real-time Voice RAG Agent!
(also cloned my voice in just 5 seconds)
Here's an overview of what this agent does:
1. Listens to real-time audio
2. Transcribes it via AssemblyAI
3. Uses your docs (via LlamaIndex) to craft an answer
4. Speaks that answer back with Cartesia
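
Here's a rough sketch of that four-step loop in plain Python. Everything in it is a stand-in I made up for illustration: `VoiceRagPipeline`, `fake_transcribe`, `fake_answer`, and `fake_speak` are hypothetical names, not the actual AssemblyAI, LlamaIndex, Cartesia, or LiveKit APIs. The toy keyword match only mimics the retrieval step so the example runs without any API keys.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical pipeline: each stage is a plain callable, so a real
# speech-to-text, RAG, or text-to-speech client could be swapped in later.
@dataclass
class VoiceRagPipeline:
    transcribe: Callable[[bytes], str]   # audio -> text (speech-to-text stage)
    answer: Callable[[str], str]         # question -> answer (RAG stage)
    speak: Callable[[str], bytes]        # answer -> audio (text-to-speech stage)

    def handle_turn(self, audio: bytes) -> bytes:
        """Run one conversational turn: listen, transcribe, answer, speak."""
        text = self.transcribe(audio)
        reply = self.answer(text)
        return self.speak(reply)

# Toy "document store" standing in for your indexed docs.
DOCS = [
    "Cartesia provides text-to-speech.",
    "LiveKit handles real-time audio orchestration.",
]

def fake_transcribe(audio: bytes) -> str:
    # Assumption: pretend the audio bytes are already UTF-8 text.
    return audio.decode("utf-8")

def fake_answer(question: str) -> str:
    # Toy retrieval: return the doc sharing the most words with the question.
    words = set(question.lower().split())
    return max(DOCS, key=lambda d: len(words & set(d.lower().rstrip(".").split())))

def fake_speak(text: str) -> bytes:
    # Assumption: pretend the encoded text is synthesized audio.
    return text.encode("utf-8")

pipeline = VoiceRagPipeline(fake_transcribe, fake_answer, fake_speak)
```

Calling `pipeline.handle_turn(b"what does livekit do")` walks one question through all three stages and returns the "spoken" reply as bytes. Keeping each stage as a separate callable is just one way to structure it; it makes swapping a fake for a real service a one-line change.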
Tech stack:
- Cartesia for SOTA text-to-speech
- AssemblyAI for speech-to-text
- LlamaIndex to power RAG
- LiveKit for orchestration
Why Cartesia?
Cartesia enables you to generate seamless speech, power voice applications, and fine-tune your own voice models on the fastest real-time AI platform.
I used it to clone my own voice with just a 5-second clip and powered the agent with it.
You can find all the code and everything you need in this GitHub repo: https://github.com/patchy631/ai-engineering-hub/tree/main/rag-voice-agent
It's fairly easy to follow along, and in case you're stuck, feel free to raise an issue.
I'll be happy to help! :)
#ai #llm #agent #genai
Translated at: 2025-04-06T08:54:29Z