Stream Video provides Vision Agents, an open-source video AI framework built from the ground up for low-latency voice and vision applications running on the edge. Cartesia is available as an official text-to-speech (TTS) plugin. The Vision Agents “Simple Agent” GitHub example and the voice and video guides are good places to start.

Demo

Vision Agents Cartesia Demo

Try out the Simple Agent Cartesia demo.