Talk with AI teammates by voice

Use realtime voice to let people speak with Ally or another AI teammate in a natural conversation, with text and file context when supported.

Where voice can be used

Alloy supports realtime voice sessions for public webchat, internal skills, internal AI teammates, and Ally.

This means voice can be used both inside the workspace and in customer-facing experiences, depending on how the AI teammate and channel are configured.

How voice sessions work

A voice session starts with a session creation request, after which the client connects over a WebSocket. The client sends microphone audio, and Alloy streams audio responses back.

Internal voice sessions are tied to a conversation. Public webchat voice can include contact context and conversation settings from the widget.
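
The client side of this flow can be sketched in Python. This is a minimal illustration, not Alloy's actual API: the `SessionRequest` fields, the `surface` labels, and the audio format (16-bit mono PCM at 16 kHz, sent in 20 ms frames) are all assumptions made for the example. It shows only how a client might build the session creation body and split microphone audio into frames before sending them over the WebSocket.

```python
import json
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class SessionRequest:
    """Hypothetical body for the session creation request (field names assumed)."""
    surface: str                            # assumed labels: "webchat" or "internal"
    conversation_id: Optional[str] = None   # internal sessions are tied to a conversation
    contact: Optional[dict] = None          # public webchat can include contact context

    def to_json(self) -> str:
        # Internal voice sessions must reference a conversation.
        if self.surface == "internal" and not self.conversation_id:
            raise ValueError("internal voice sessions need a conversation_id")
        body = {"surface": self.surface}
        if self.conversation_id:
            body["conversation_id"] = self.conversation_id
        if self.contact:
            body["contact"] = self.contact
        return json.dumps(body, sort_keys=True)


def audio_frames(pcm: bytes, sample_rate: int = 16000, frame_ms: int = 20) -> Iterator[bytes]:
    """Split 16-bit mono PCM into fixed-duration frames for streaming (format assumed)."""
    frame_bytes = sample_rate * 2 * frame_ms // 1000  # 2 bytes per sample
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]
```

A client would send the JSON body in the session creation request, open the WebSocket it is given, and then write each frame from `audio_frames` to the socket while playing back the audio that arrives in response.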

Models, voices, and validation

AI teammate voice setup depends on the model and voice catalog. Chat models, text-to-speech (TTS) models, speech-to-text (STT) models, realtime models, and voices each have their own capabilities.

When a TTS model is selected, the teammate also needs a compatible voice from the same provider. Any model selected for STT must have the speech-to-text capability. These rules ensure the voice setup is valid before a session starts.
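
The two rules above can be expressed as a small validation function. The sketch below is illustrative only: the `Model` and `Voice` types, the capability labels, and the error messages are hypothetical and do not reflect Alloy's actual catalog schema.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    capabilities: FrozenSet[str]  # assumed labels: "chat", "tts", "stt", "realtime"


@dataclass(frozen=True)
class Voice:
    name: str
    provider: str


def validate_voice_setup(tts: Optional[Model],
                         voice: Optional[Voice],
                         stt: Optional[Model]) -> List[str]:
    """Return a list of problems; an empty list means the setup is valid."""
    errors: List[str] = []
    if tts is not None:
        if "tts" not in tts.capabilities:
            errors.append(f"{tts.name} is not a TTS model")
        if voice is None:
            errors.append("a TTS model requires a voice")
        elif voice.provider != tts.provider:
            errors.append(f"voice {voice.name} is not from provider {tts.provider}")
    if stt is not None and "stt" not in stt.capabilities:
        errors.append(f"{stt.name} is not speech-to-text capable")
    return errors
```

Returning every problem at once, rather than raising on the first one, lets a settings form show all the fixes a teammate configuration needs in a single pass.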

Guardrails and limits

Currently supported locales are `en-US`, `ru-RU`, and `es-ES`, with `en-US` as the fallback.
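
A client can mirror this fallback when choosing which locale to request. The following is a minimal sketch: the function name is hypothetical, and the tolerance for case and underscore variants (for example `ru_ru`) is an assumption about client-side normalization, not documented Alloy behavior.

```python
from typing import Optional

SUPPORTED_LOCALES = ("en-US", "ru-RU", "es-ES")
FALLBACK_LOCALE = "en-US"


def resolve_locale(requested: Optional[str]) -> str:
    """Return the supported locale matching `requested`, or the fallback."""
    if not requested:
        return FALLBACK_LOCALE
    # Assumed normalization: trim, accept "_" as separator, ignore case.
    normalized = requested.strip().replace("_", "-").lower()
    for locale in SUPPORTED_LOCALES:
        if locale.lower() == normalized:
            return locale
    # Anything unsupported, including other regions of a supported
    # language such as "es-MX", falls back to en-US.
    return FALLBACK_LOCALE
```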

Voice sessions can also process JSON user messages with attachments. Attachment guardrails currently allow up to 5 files of up to 10 MB each, limited to supported file types such as images, PDF, plain text, XML, and SQLite.
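
Checking these guardrails on the client before sending a message avoids a rejected upload. The sketch below makes several assumptions: the `Attachment` type is hypothetical, 10 MB is read as 10 × 1024 × 1024 bytes, and the MIME types are a plausible mapping of the file types listed above rather than Alloy's exact allowlist.

```python
from dataclasses import dataclass
from typing import List

MAX_FILES = 5
MAX_BYTES = 10 * 1024 * 1024  # assumed reading of "10 MB"

# Assumed MIME types for PDF, plain text, XML, and SQLite.
ALLOWED_TYPES = {
    "application/pdf",
    "text/plain",
    "application/xml",
    "text/xml",
    "application/vnd.sqlite3",
}


@dataclass
class Attachment:
    name: str
    mime_type: str
    size_bytes: int


def is_allowed_type(mime_type: str) -> bool:
    """Images are accepted by prefix; other types must be on the allowlist."""
    return mime_type.startswith("image/") or mime_type in ALLOWED_TYPES


def check_attachments(files: List[Attachment]) -> List[str]:
    """Return a list of guardrail violations; empty means the batch is acceptable."""
    problems: List[str] = []
    if len(files) > MAX_FILES:
        problems.append(f"too many files: {len(files)} > {MAX_FILES}")
    for f in files:
        if f.size_bytes > MAX_BYTES:
            problems.append(f"{f.name}: larger than 10 MB")
        if not is_allowed_type(f.mime_type):
            problems.append(f"{f.name}: unsupported type {f.mime_type}")
    return problems
```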

Frequently asked questions

Can customers use voice in the webchat widget?

Yes, when voice is enabled for the configured public webchat experience.

Can Ally use voice?

Yes. Alloy has an internal voice session path for Ally.

Do all models support voice?

No. Voice depends on model capabilities. TTS, STT, realtime support, provider, and voice compatibility all matter.
