Deployment & Data Residency

Overview

Alloy can run as a fully managed service or as a full-platform deployment inside infrastructure controlled by your organization. This document explains the deployment options across two dimensions: where your infrastructure and data live, and where model (LLM) calls run. Together they let you balance speed of setup against control over your data.

In this documentation, customer-hosted deployment means Bring Your Own Cloud (BYOC) or On-Premises. Both place the full Alloy platform inside your environment. Alloy operates BYOC remotely in your cloud account; your team operates On-Premises on your own servers or data center.

Where your infrastructure and data live

Choose the level of isolation that matches your security and compliance requirements.

Option	What it is	Who operates it	Availability
Shared Cloud	Runs on Alloy's shared, multi-tenant infrastructure	Alloy	All plans
Dedicated VPC	Isolated network boundary for your organization; runtime and data are not shared with other customers	Alloy	Enterprise
Bring Your Own Cloud (BYOC)	The full Alloy platform is deployed inside your cloud account (AWS, GCP, or Azure); your business data stays in your cloud account	Alloy (remotely, in your cloud)	Enterprise
On-Premises	Alloy ships as a Docker container you install on your own servers or data center; no dependency on any Alloy-operated infrastructure	You (self-managed)	Enterprise

Dedicated VPC, BYOC, and On-Premises are available only on Enterprise plans; Shared Cloud is available on all plans.

For every option except On-Premises, Alloy manages provisioning, scaling, and monitoring. With On-Premises, your team operates the platform.

Requesting the local deployment bundle

The [Enterprise page](https://alloy.cx/enterprise) includes a `Download bundle` action for local deployment. It opens `Get the local deployment bundle` and requires a work email address.

Submitting the form sends `POST /api/download-request` with the email address. A successful request does not download the bundle immediately. The confirmation says Alloy will email the bundle and setup instructions shortly. Invalid email input is rejected; if the request cannot reach Alloy or the service is unavailable, the dialog directs the visitor to contact `[email protected]` for the bundle.
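
For illustration, the request described above could be built as follows. This is a minimal sketch, not the form's actual implementation: the JSON body with a single `email` field, the `https://alloy.cx` origin, and the simple `@` check are all assumptions, since the source only states the method, the path, and that the email address is sent. The example constructs the request without sending it.

```python
import json
import urllib.request

# Assumption: the endpoint is served from the same origin as the Enterprise page.
BASE_URL = "https://alloy.cx"


def build_download_request(email: str) -> urllib.request.Request:
    """Build (but do not send) the bundle request for a work email address."""
    # Minimal client-side check only; the form's real validation rules are not documented.
    if "@" not in email or email.startswith("@") or email.endswith("@"):
        raise ValueError("invalid email address")
    # Assumption: the body is JSON with a single "email" field.
    body = json.dumps({"email": email}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/download-request",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_download_request("ops@example.com")
print(request.get_method(), request.full_url)
```

A successful response would only confirm that the email is on its way; the bundle itself arrives by email, so a client should not expect file content in the response.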

Customer-hosted installation model

The installer, bootstrap, and system-content behaviors in this section apply to customer-hosted deployments: BYOC and On-Premises. These are deployment-time operator tasks, not settings in the Alloy product UI and not part of normal customer organization setup in Shared Cloud or Dedicated VPC.

A customer-hosted operator can enable `ONE_ORGANIZATION_MODE=true` so newly registered users share one non-system organization. In this mode, usage paid through Alloy system provider keys still produces token-history and calculated-cost records, but it does not reduce the shared organization's `available_tokens` balance. An unset value or `false` keeps normal balance deduction for non-system organizations using system provider keys.
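
The accounting rule above can be sketched in a few lines. This is an illustrative model only: the `Organization` class, the `record_system_key_usage` helper, and the case-insensitive parsing of the flag are assumptions, not Alloy's code. Only `ONE_ORGANIZATION_MODE` and `available_tokens` come from the text.

```python
import os
from dataclasses import dataclass, field


def one_organization_mode(env=None) -> bool:
    """Return True only when ONE_ORGANIZATION_MODE is explicitly "true"."""
    env = os.environ if env is None else env
    # Assumption: compared case-insensitively; unset or "false" means the mode is off.
    return env.get("ONE_ORGANIZATION_MODE", "").strip().lower() == "true"


@dataclass
class Organization:
    """Hypothetical stand-in for a non-system organization."""
    available_tokens: int
    token_history: list = field(default_factory=list)


def record_system_key_usage(org: Organization, tokens: int, cost: float,
                            shared_mode: bool) -> None:
    """Record usage that was paid through a system provider key."""
    # Token-history and calculated-cost records are written in both modes.
    org.token_history.append({"tokens": tokens, "cost": cost})
    # Only the balance deduction is skipped in one-organization mode.
    if not shared_mode:
        org.available_tokens -= tokens


shared = Organization(available_tokens=1000)
record_system_key_usage(shared, 200, 0.04,
                        one_organization_mode({"ONE_ORGANIZATION_MODE": "true"}))
print(shared.available_tokens, len(shared.token_history))
```

The point of the sketch is that the flag changes only the balance side effect; history and cost reporting behave the same either way.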

The current source distribution provides interactive environment installers for the backend, frontend, webchat widget, authentication service, core Docker infrastructure, and optional Google Workspace MCP services.

The backend root installer can orchestrate component configuration, install dependencies, start the database services, create the database, and apply migrations.
After configuration, the root installer starts the core, authentication, and optional Google Workspace MCP Docker Compose stacks. A stack startup failure is reported as a warning so setup can finish with follow-up instructions.
Component installers preserve values that are already configured, back up an existing environment file, generate values that can be generated safely, and prompt for values that cannot be derived.
Installers can be rerun to fill newly added or previously omitted settings.
The core Docker stack includes PostgreSQL, Redis, Qdrant, application and public realtime services, nginx, and a MinIO S3-compatible object store.
In the customer-hosted development layout, nginx proxies `/widget/` on the API host to the webchat widget development service.
Nginx configuration and local TLS certificates are rendered from tracked templates. Generated environment files, rendered configuration, and generated certificates are intentionally not tracked.
The optional auth-server LDAP blueprint is disabled by default. When enabled, it connects Authentik to an operator-provided LDAP directory; Alloy does not bundle an LDAP server.
LDAP directory shape is configurable for OpenLDAP or Active Directory through user/group search DNs, user/group object filters, group membership field, and object uniqueness field. Unset values use OpenLDAP defaults; the Active Directory preset searches the full base DN, uses AD object filters, and uses `objectSid` for uniqueness.
The shared LDAP identity mapping reads OpenLDAP attributes first and falls back to Active Directory equivalents such as `sAMAccountName`, `displayName`, and `userPrincipalName`.
A direct username/password LDAP form also requires an enabled auth-provider entry of type `ldap-single` that points to the configured Authentik flow.
The Google source is created only when Google credentials are configured, so LDAP-only Authentik deployments can provision without a Google source.
The backend brand name is configurable for customer-hosted branding and defaults to `Alloy`. It is used in Hosted MCP initialization instructions and generated Microsoft Teams manifest developer metadata.
The frontend build accepts a configurable display brand name and domain. `NEXT_PUBLIC_BRAND_NAME` defaults to `Alloy` and replaces the product name in page titles, metadata descriptions, OpenGraph text, login and invite page copy, the API Keys card label, and the PWA manifest. `NEXT_PUBLIC_BRAND_DOMAIN` defaults to `alloy.cx` and replaces the app name derived for page-title suffixes and the installable-app manifest.
Custom brand favicon and PWA icon assets placed under `public/fav/custom/` are preferred over the environment-specific default icon set at build time.
The frontend build accepts configurable comma-separated development origins and HTTPS remote-image hostnames. Remote images outside the configured hostname list are not eligible for Next.js image optimization.
Frontend builds can set `NEXT_OUTPUT=standalone` to emit the self-contained Next.js standalone server output for source-free production deployment. Without that value, the normal build output is unchanged.
The webchat widget build accepts a configurable footer brand URL and displayed brand domain. Defaults remain `https://alloy.cx` and `alloy.cx`.
The webchat widget dev server allows localhost, hostnames parsed from its configured API URLs, and optional explicit comma-separated hosts.
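
The component-installer behavior described above (preserve configured values, back up the existing file, generate what can be generated, prompt for the rest) can be sketched as follows. This is a simplified illustration, not the shipped installer: the `KEY=VALUE` file format, the `.bak` backup suffix, the use of `secrets.token_hex`, and the function names are all assumptions.

```python
import secrets
import shutil
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blank lines and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def configure_env(path: Path, generated: list, prompted: list, ask=input) -> dict:
    """Fill in missing settings without touching values that are already set."""
    values = {}
    if path.exists():
        # Back up the existing environment file before rewriting it.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        values = parse_env(path.read_text())
    for key in generated:
        if not values.get(key):
            # Safe to generate locally, for example a random secret.
            values[key] = secrets.token_hex(32)
    for key in prompted:
        if not values.get(key):
            # Cannot be derived, so ask the operator.
            values[key] = ask(f"{key}: ")
    path.write_text("".join(f"{k}={v}\n" for k, v in values.items()))
    return values
```

Because configured keys are skipped, running `configure_env` a second time changes nothing unless new keys were added to the `generated` or `prompted` lists, which mirrors how the installers can be rerun to fill newly added or previously omitted settings.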

Platform bootstrap and system documentation

Customer-hosted platform installation and updates include operator-only instance bootstrap:

Bootstrap creates or resolves the system organization and global Ally even when admin selection is skipped or no user has registered yet.
Bootstrap stores the global Ally setting and seeds missing prompt, model, and fallback settings without overwriting configured values.
System-content synchronization publishes bundled `business docs`, `integrations`, and `tech docs` categories under the system organization's `Team space`, queues the published files for indexing, and synchronizes the global Ally instructions.
Synchronization replaces those shipped category folders but leaves other `Team space` content untouched.
These are instance-level deployment operations for BYOC and On-Premises, separate from normal customer-organization creation in Shared Cloud and Dedicated VPC.
Alloy performs these operator tasks for BYOC. The customer deployment operator performs them for On-Premises.
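
Two of the rules above, seeding without overwriting and replacing only the shipped category folders, can be illustrated with a small sketch. The data model is hypothetical: settings are a plain dictionary, `Team space` is a mapping from folder name to file names, and the function names are invented for this example. Only the three category names come from the text.

```python
def seed_missing(settings: dict, defaults: dict) -> dict:
    """Add default settings only where no value is configured yet."""
    seeded = dict(settings)
    for key, value in defaults.items():
        # Assumption: any existing key counts as configured and is left alone.
        seeded.setdefault(key, value)
    return seeded


# Category folder names as listed in the text above.
SHIPPED_CATEGORIES = ("business docs", "integrations", "tech docs")


def sync_team_space(team_space: dict, bundled: dict) -> dict:
    """Replace shipped category folders; leave every other folder untouched."""
    synced = {name: files for name, files in team_space.items()
              if name not in SHIPPED_CATEGORIES}
    for name in SHIPPED_CATEGORIES:
        synced[name] = list(bundled.get(name, []))
    return synced


settings = seed_missing({"model": "custom-model"},
                        {"model": "default-model", "fallback": "default-fallback"})
space = sync_team_space(
    {"tech docs": ["old.md"], "our notes": ["plan.md"]},
    {"tech docs": ["deployment.md"], "integrations": ["slack.md"]},
)
print(settings)
print(space)
```

In this model a stale file such as `old.md` disappears from a shipped folder after synchronization, while a customer folder such as `our notes` survives unchanged.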

For detailed operator behavior, see `operations-runtime.md`. For the defaults created by bootstrap, see `defaults.md`.

Model (LLM) calls

These options govern the LLM calls Alloy makes itself, both when running Alloy-managed agents and when a workflow calls a model. You control where those calls run.

External agents connected to Alloy (Claude, Codex, Cursor, and similar) run on their own LLM endpoints.

All model-call options are available on all plans.

Option	Where calls run	Keys & billing	What reaches the endpoint
Alloy-Managed LLM	Models and infrastructure provided by Alloy (default)	Alloy	All model-visible data is processed through Alloy's infrastructure and model providers
Your Own Provider	A commercial provider you connect with your own keys — AWS Bedrock, Google Vertex, Azure OpenAI, or an aggregator such as OpenRouter	You (your keys / account)	All model-visible data is sent to the provider you choose, under your account and contracts — a public endpoint outside your network
Self-Hosted Models	Open-weight models you run on your own hardware via a local inference server — Ollama, vLLM, LM Studio, or TGI	You (your hardware)	All model-visible data stays inside your environment; nothing is sent to an external provider

How the two dimensions work together

Self-Hosted Models pair naturally with BYOC or On-Premises — both your data and your model calls stay inside your environment, with nothing sent to an external provider.
Your Own Provider keeps billing and governance under your account, but model calls still run on a public endpoint outside your network.
Shared Cloud uses the Alloy-Managed LLM by default for the simplest possible start.
Self-Hosted Models require Alloy to reach your inference server, so with Shared Cloud or Dedicated VPC the inference endpoint must be network-reachable from Alloy's infrastructure.
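
The pairing rules in this section can be restated as a small decision helper. This is only a sketch of the reasoning above; the enum and function names are invented for illustration and are not part of any Alloy API.

```python
from enum import Enum


class Hosting(Enum):
    SHARED_CLOUD = "Shared Cloud"
    DEDICATED_VPC = "Dedicated VPC"
    BYOC = "Bring Your Own Cloud (BYOC)"
    ON_PREMISES = "On-Premises"


class LLM(Enum):
    ALLOY_MANAGED = "Alloy-Managed LLM"
    OWN_PROVIDER = "Your Own Provider"
    SELF_HOSTED = "Self-Hosted Models"


# Customer-hosted deployment means BYOC or On-Premises.
CUSTOMER_HOSTED = {Hosting.BYOC, Hosting.ON_PREMISES}


def stays_in_your_environment(hosting: Hosting, llm: LLM) -> bool:
    """True when both business data and model calls stay inside your environment."""
    return hosting in CUSTOMER_HOSTED and llm is LLM.SELF_HOSTED


def inference_server_must_be_reachable_from_alloy(hosting: Hosting, llm: LLM) -> bool:
    """True when Alloy-operated infrastructure has to reach your inference server."""
    return llm is LLM.SELF_HOSTED and hosting not in CUSTOMER_HOSTED


for hosting in Hosting:
    for llm in LLM:
        print(hosting.value, "+", llm.value, "->",
              stays_in_your_environment(hosting, llm),
              inference_server_must_be_reachable_from_alloy(hosting, llm))
```

The loop prints all twelve combinations, which makes it easy to see that only customer-hosted deployments with Self-Hosted Models keep everything inside your environment.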