
Deployment & Data Residency

Overview

Alloy can run as a fully managed service or be self-hosted on your own infrastructure. This document explains the deployment options available to you across two dimensions: where your infrastructure and data live, and where model (LLM) calls run. Together they let you balance speed of setup against control over your data.

Where your infrastructure and data live

Choose the level of isolation that matches your security and compliance requirements.

| Option | What it is | Who operates it | Availability |
| --- | --- | --- | --- |
| Shared Cloud | Runs on Alloy's shared, multi-tenant infrastructure | Alloy | All plans |
| Dedicated VPC | Isolated network boundary for your organization; runtime and data are not shared with other customers | Alloy | Enterprise |
| Bring Your Own Cloud (BYOC) | The full Alloy platform is deployed inside your cloud account (AWS, GCP, or Azure); your business data stays in your cloud account | Alloy (remotely, in your cloud) | Enterprise |
| On-Premises | Alloy ships as a Docker container you install on your own servers or data center; no dependency on any Alloy-operated infrastructure | You (self-managed) | Enterprise |

Dedicated VPC, BYOC, and On-Premises are available on Enterprise plans.

For every option except On-Premises, Alloy manages provisioning, scaling, and monitoring — so increasing data control does not add operational burden on your team. With On-Premises, your team operates the platform.
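As an illustration only, the options above can be encoded as data so that plan availability and operational responsibility are easy to look up. This is a minimal sketch, not an Alloy API: the `DeploymentOption` class, the `is_enterprise` flag, and the helper names are all hypothetical and simply mirror the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentOption:
    """One row of the deployment table (hypothetical model, not an Alloy API)."""
    name: str
    operated_by: str
    enterprise_only: bool


# Values copied from the table above.
DEPLOYMENT_OPTIONS = (
    DeploymentOption("Shared Cloud", "Alloy", enterprise_only=False),
    DeploymentOption("Dedicated VPC", "Alloy", enterprise_only=True),
    DeploymentOption("BYOC", "Alloy (remotely, in your cloud)", enterprise_only=True),
    DeploymentOption("On-Premises", "You (self-managed)", enterprise_only=True),
)


def available_options(is_enterprise: bool) -> list[str]:
    """Return the option names a plan can use, in table order."""
    return [
        option.name
        for option in DEPLOYMENT_OPTIONS
        if is_enterprise or not option.enterprise_only
    ]


def customer_operated() -> list[str]:
    """Return the options where your team, not Alloy, operates the platform."""
    return [
        option.name
        for option in DEPLOYMENT_OPTIONS
        if option.operated_by.startswith("You")
    ]


if __name__ == "__main__":
    print(available_options(is_enterprise=False))
    print(customer_operated())
```

The sketch makes the trade-off explicit: only one option shifts operational work to your team, while three of the four require an Enterprise plan.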

Model (LLM) calls

These options govern the LLM calls Alloy makes itself — both when running Alloy-managed agents and when a workflow calls a model. You control where those calls run.

External agents connected to Alloy (Claude, Codex, Cursor, and similar) run on their own LLM endpoints.

All three model options are available on every plan.

| Option | Where calls run | Keys & billing | What reaches the endpoint |
| --- | --- | --- | --- |
| Alloy-Managed LLM | Models and infrastructure provided by Alloy (default) | Alloy | All model-visible data is processed through Alloy's infrastructure and model providers |
| Your Own Provider | A commercial provider you connect with your own keys: AWS Bedrock, Google Vertex, Azure OpenAI, or an aggregator such as OpenRouter | You (your keys / account) | All model-visible data is sent to the provider you choose, under your account and contracts (a public endpoint outside your network) |
| Self-Hosted Models | Open-weight models you run on your own hardware via a local inference server: Ollama, vLLM, LM Studio, or TGI | You (your hardware) | All model-visible data stays inside your environment; nothing is sent to an external provider |
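For Self-Hosted Models, one basic sanity check is whether the inference server accepts TCP connections from the host that will call it. The sketch below is a generic probe using only the Python standard library; it is not part of Alloy. The host name is a placeholder, and the port is an assumption (11434 is Ollama's default; other servers and custom setups will differ).

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Placeholder values: replace with your own inference server address.
    host, port = "inference.internal.example", 11434
    status = "reachable" if is_reachable(host, port) else "not reachable"
    print(f"{host}:{port} is {status}")
```

A successful TCP connection only shows that something is listening on that port; it does not confirm that the server is serving a model or that authentication is configured.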

How the two dimensions work together

  • Self-Hosted Models pair naturally with BYOC or On-Premises — both your data and your model calls stay inside your environment, with nothing sent to an external provider.
  • Your Own Provider keeps billing and governance under your account, but model calls still run on a public endpoint outside your network.
  • Shared Cloud uses the Alloy-Managed LLM by default for the simplest possible start.
  • Self-Hosted Models require Alloy to reach your inference server, so with Shared Cloud or Dedicated VPC that endpoint must be network-reachable from Alloy.
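The pairing rules above can be sketched as two small predicates. This is an illustrative model of the text, not Alloy code: the enum and function names are hypothetical, and the logic encodes only what the bullets state.

```python
from enum import Enum


class Deployment(Enum):
    SHARED_CLOUD = "Shared Cloud"
    DEDICATED_VPC = "Dedicated VPC"
    BYOC = "BYOC"
    ON_PREMISES = "On-Premises"


class LLMOption(Enum):
    ALLOY_MANAGED = "Alloy-Managed LLM"
    OWN_PROVIDER = "Your Own Provider"
    SELF_HOSTED = "Self-Hosted Models"


# Deployments that run inside your own cloud account or data center.
IN_YOUR_ENVIRONMENT = {Deployment.BYOC, Deployment.ON_PREMISES}


def everything_stays_in_your_environment(deployment: Deployment, llm: LLMOption) -> bool:
    """True only when both your data and your model calls stay in your environment."""
    return deployment in IN_YOUR_ENVIRONMENT and llm is LLMOption.SELF_HOSTED


def endpoint_must_be_reachable_from_alloy(deployment: Deployment, llm: LLMOption) -> bool:
    """True when a self-hosted inference server is called from Alloy-run infrastructure."""
    return llm is LLMOption.SELF_HOSTED and deployment not in IN_YOUR_ENVIRONMENT


if __name__ == "__main__":
    for deployment in Deployment:
        for llm in LLMOption:
            print(
                f"{deployment.value} + {llm.value}: "
                f"fully in your environment={everything_stays_in_your_environment(deployment, llm)}, "
                f"endpoint must be reachable from Alloy={endpoint_must_be_reachable_from_alloy(deployment, llm)}"
            )
```

Running the script prints all twelve combinations, which can help when comparing a proposed setup against your compliance requirements.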
