This guide will help you set up a real-time AI voice assistant using the SuperU SDK and WebSockets. You’ll be able to speak to an AI and get instant, natural-sounding replies—perfect for customer support, call centers, or adding a voice assistant to any app.
Introduction
Pluto lets you build a real-time conversational AI voice assistant for your website or app. It uses your microphone to capture audio, streams it to an AI model, and plays back the AI’s spoken response—all in just a few hundred milliseconds.
Quick Overview
Goal: Enable real-time voice conversations with AI.
Use Cases:
Customer support over AI calls
Voice assistant for your website/app
Cold calling with AI
Call center automation
Tech Stack:
Python
PyAudio
WebSockets
SuperU SDK
Input: Microphone audio
Output: AI-generated speech
Prerequisites
Before you start, make sure you have:
Python installed on your computer
Internet connection
A microphone and speakers/headphones
An account on SuperU
Getting Started
Sign Up and Get Your API Key
Go to dev.superu.ai and sign up for an account
After signing up, visit your dashboard
Copy your API key from the dashboard
For security, store your API key in an environment variable instead of hard-coding it in your code.
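For example, assuming you export the key under a name like SUPERU_API_KEY (the variable name here is just a convention for this sketch, not something the SDK requires), you can read it at startup:

import os
import superu

# Read the key from the environment rather than hard-coding it.
# SUPERU_API_KEY is an example name; set it in your shell first,
# e.g. export SUPERU_API_KEY="..." on macOS/Linux.
api_key = os.environ["SUPERU_API_KEY"]
superu_client = superu.SuperU(api_key)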
Setup & Configuration
Install Dependencies
Open your terminal and run the appropriate command for your operating system:
# Windows
pip install pyaudio websockets superu
# macOS/Linux
pip3 install pyaudio websockets superu
Configure Audio
The code uses your system's default microphone and speakers, so no extra configuration is needed. Just confirm both devices work before running the assistant.
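If you want to check which devices PyAudio can see, a quick listing like the following (plain PyAudio calls, mirroring the device lookup used in the demo code later in this guide) prints each device and whether it supports input, output, or both:

import pyaudio

pya = pyaudio.PyAudio()
for i in range(pya.get_device_count()):
    dev = pya.get_device_info_by_index(i)
    kinds = []
    if dev['maxInputChannels'] > 0:
        kinds.append("input")
    if dev['maxOutputChannels'] > 0:
        kinds.append("output")
    print(f"{i}: {dev['name']} ({', '.join(kinds)})")
pya.terminate()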
Running the Demo
Copy and Run the Code
Create a new Python file (e.g., pluto_voice_assistant.py) and paste the following code. Replace <YOUR_API_KEY> with your actual API key:
import asyncio
import pyaudio
import websockets
import uuid
import base64
import json
import superu
# Audio settings: 16-bit mono PCM at 16 kHz, read in 1024-frame chunks
FORMAT = pyaudio.paInt16
CHANNELS = 1
SEND_SAMPLE_RATE = 16000
CHUNK_SIZE = 1024

pya = pyaudio.PyAudio()

def get_default_input_device_index():
    # Pick the first device that can record audio
    for i in range(pya.get_device_count()):
        dev = pya.get_device_info_by_index(i)
        if dev['maxInputChannels'] > 0:
            print(f"Using device: {dev['name']} (Index: {i})")
            return i
    raise RuntimeError("No input device found.")

mic_index = get_default_input_device_index()

async def send_audio(stream, websocket, streamId):
    # Announce the stream, then forward microphone chunks as base64-encoded "media" events
    await websocket.send(json.dumps({"event": "start", "start": {"streamId": streamId}}))
    while True:
        audio_chunk = await asyncio.to_thread(stream.read, CHUNK_SIZE)
        payload = base64.b64encode(audio_chunk).decode('utf-8')
        await websocket.send(json.dumps({"event": "media", "media": {"payload": payload}}))

async def receive_messages(websocket):
    # Play back "playAudio" events from the server through the default output device
    output_stream = await asyncio.to_thread(
        pya.open, format=FORMAT, channels=CHANNELS, rate=SEND_SAMPLE_RATE, output=True)
    while True:
        try:
            response = json.loads(await websocket.recv())
            if response.get("event") == "playAudio":
                audio_data = base64.b64decode(response["media"]["payload"])
                await asyncio.to_thread(output_stream.write, audio_data)
        except websockets.exceptions.ConnectionClosed:
            print("WebSocket connection closed.")
            break

async def listen_and_send(uri, streamId):
    # Open the microphone, connect to the call's WebSocket, and run send/receive concurrently
    stream = await asyncio.to_thread(
        pya.open, format=FORMAT, channels=CHANNELS, rate=SEND_SAMPLE_RATE,
        input=True, input_device_index=mic_index, frames_per_buffer=CHUNK_SIZE)
    async with websockets.connect(uri) as websocket:
        print("Connected to WebSocket. Streaming audio...")
        try:
            await asyncio.gather(
                send_audio(stream, websocket, streamId),
                receive_messages(websocket))
        except Exception as e:
            print(f"Error: {e}")
        finally:
            stream.stop_stream()
            stream.close()
            pya.terminate()
            print("Connection closed.")

if __name__ == "__main__":
    superu_client = superu.SuperU('<YOUR_API_KEY>')

    First_message = "Hey there! Ready to explore some fascinating science today?"
    System_prompt = """
    You are a helpful and curious science assistant. Your job is to answer questions clearly, concisely, and in a way that's engaging for someone interested in science.
    """

    pluto = superu_client.pluto.create_call(
        first_message=First_message,
        system_prompt=System_prompt,
        voice_id="90ipbRoKi4CpHXvKVtl0"  # Anika - Customer Care Agent
    )

    asyncio.run(listen_and_send(pluto['ws_url'], pluto['streamId']))
Run the Application
Execute your Python file using the appropriate command:
# Windows
python pluto_voice_assistant.py
# macOS/Linux
python3 pluto_voice_assistant.py
You should see “Connected to WebSocket. Streaming audio…” in your terminal, indicating a successful connection.
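The demo streams until the connection drops; to stop it, press Ctrl+C in the terminal. If you prefer a quieter exit than a KeyboardInterrupt traceback, one option (a small addition to the demo, not part of the original script) is to wrap the final asyncio.run(...) line:

try:
    asyncio.run(listen_and_send(pluto['ws_url'], pluto['streamId']))
except KeyboardInterrupt:
    print("Stopped by user.")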
Customizing Your Assistant
You can personalize your AI assistant by modifying several key components:
Change the First Message
The First_message variable defines what the assistant says when the conversation starts.
First_message = "Hello! How can I help you today?"
Set this to any greeting or introduction that fits your use case.
Modify the System Prompt
The System_prompt variable defines how the assistant should behave and respond.
System_prompt = """
You are a friendly and knowledgeable customer support assistant.
Help users with their questions about our products and services.
"""
Adjust this prompt to fit your specific use case, such as customer support, language learning, or technical help.
How It Works
The voice assistant operates through a real-time audio streaming process:
Audio Capture: Your microphone captures your voice
WebSocket Transmission: Audio is sent to the AI model over the WebSocket connection
AI Processing: The AI listens, processes, and generates a response
Voice Synthesis: The AI's response is converted to natural-sounding speech
Audio Playback: You hear the AI's response through your speakers, all in real time
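For reference, the three message shapes the demo script exchanges over the WebSocket look like this (field names taken from the code above; the payloads are base64-encoded audio in whatever format the streams are opened with, 16 kHz 16-bit mono PCM in the demo):

# Client -> server, sent once to open the stream for this call
start_event = {"event": "start", "start": {"streamId": streamId}}

# Client -> server, sent repeatedly, one base64-encoded microphone chunk per message
media_event = {"event": "media", "media": {"payload": "<base64 audio chunk>"}}

# Server -> client, carries the synthesized speech to play back
play_audio_event = {"event": "playAudio", "media": {"payload": "<base64 audio chunk>"}}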
Available Voices
You can choose from many different AI voices by changing the voice_id parameter. Each voice has a unique ID and supports various languages.
See the Available Voices page for a list of all voice IDs with descriptions and tested use cases.
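To switch voices, pass a different ID when creating the call. The placeholder below is hypothetical; substitute any ID from the Available Voices list:

pluto = superu_client.pluto.create_call(
    first_message=First_message,
    system_prompt=System_prompt,
    voice_id="<ANOTHER_VOICE_ID>"  # hypothetical placeholder; pick an ID from the Available Voices list
)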
Additional Resources
Congratulations! You now have a real-time AI voice assistant running on your machine.