
Architecture

Continuum consists of three components: the attestation service, the Continuum CLI, and one or more workers. This page describes how these components interact and provides more detail on their respective roles.

Attestation service & CLI

The attestation service (AS) ensures the security of the workers by verifying their attestation statements. It enforces policies on the attestation statements and distributes the prompt encryption secrets provided by the admin to the workers. The AS is hosted in a Confidential Virtual Machine (CVM). The admin establishes trust in the AS during Continuum's initialization by using the Continuum CLI. The initialization consists of three steps:

  1. Verification of the attestation report of the AS against the AS policy defined in the manifest.
  2. Setting the inference secrets.
  3. Setting the manifest.
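
The ordering of these three steps can be sketched as follows. This is a minimal illustration, not Continuum's implementation: the `AttestationService` class, the `as_policy` manifest key, and the exact-match comparison in `verify_report` are all assumptions made for the example. Real verification validates a hardware-signed attestation statement rather than comparing dictionary entries.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AttestationService:
    """Toy stand-in for the AS; all field names are hypothetical."""
    report: dict                                   # what the AS attests about itself
    secrets: dict = field(default_factory=dict)    # inference secrets, empty until set
    manifest: Optional[dict] = None                # manifest, unset until initialization


def verify_report(report: dict, policy: dict) -> bool:
    # Placeholder check: every policy entry must match the report exactly.
    return all(report.get(key) == value for key, value in policy.items())


def initialize(service: AttestationService, manifest: dict, secrets: dict) -> None:
    # Step 1: verify the AS against the AS policy defined in the manifest.
    if not verify_report(service.report, manifest["as_policy"]):
        raise PermissionError("AS attestation does not match the manifest policy")
    # Step 2: set the inference secrets (only reached after verification).
    service.secrets = dict(secrets)
    # Step 3: set the manifest.
    service.manifest = manifest
```

The point of the sketch is the ordering: secrets are handed over only after the attestation check has passed, so a service that fails verification never receives them.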

The manifest defines the requirements for the specs of each CVM. These requirements are enforced during verification. Once the AS is initialized, workers start registering with it by providing their attestation statements. Only after successful verification does a worker receive the secrets necessary for decrypting the requests for its hosted workloads.

Worker

The worker CVMs host the workload containers and are responsible for their deployment and security. The deployment is handled through the worker API, while security is ensured through the sandbox and encryption proxy.

Worker API

The worker provides an HTTP API to manage workload containers, i.e., to start and stop them. For more information on the workload configuration, refer to the Configuration page. Continuum hosts workloads in a sandbox and mediates network traffic through an encryption proxy. Each workload has its own dedicated proxy instance.

Sandbox

The workload, i.e., the container image provided by the user, is started in a gVisor sandbox. The sandbox isolates the workload from the host by handling system calls in a userspace kernel and restricting the system calls the workload can make. Furthermore, the sandbox blocks network traffic. These mechanisms protect the deployment from leaking any information to the model provider.

Encryption Proxy

Attached to each workload is a proxy container, which is the workload's only connection to the outside world. The proxy decrypts incoming user requests, securely forwards them to the sandbox, re-encrypts the responses from the workload, and sends them back to the user. The proxy supports different inference API adapters to handle encryption of different APIs, such as OpenAI or Triton Generate.
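
The decrypt, forward, re-encrypt cycle can be sketched as below. This is an illustration under stated assumptions, not the proxy's actual code: `proxy_handle`, the `field_in`/`field_out` parameters (standing in for what an API adapter would decide), and the field names used with it are hypothetical. The XOR cipher is a deliberately trivial placeholder so the example runs with the standard library only; it is NOT secure and says nothing about the encryption scheme Continuum uses.

```python
import base64
from typing import Callable


def xor_cipher(key: bytes) -> Callable[[bytes], bytes]:
    # Toy placeholder, NOT secure. XOR is its own inverse, so the same
    # function serves as both encrypt and decrypt in this sketch.
    def apply(data: bytes) -> bytes:
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))
    return apply


def proxy_handle(request: dict, field_in: str, field_out: str,
                 cipher: Callable[[bytes], bytes],
                 workload: Callable[[dict], dict]) -> dict:
    # 1. Decrypt the protected request field named by the (hypothetical) adapter.
    plain = dict(request)
    plain[field_in] = cipher(base64.b64decode(request[field_in])).decode()
    # 2. Forward the plaintext request to the sandboxed workload.
    response = dict(workload(plain))
    # 3. Re-encrypt the protected response field before it leaves the proxy.
    sealed = cipher(response[field_out].encode())
    response[field_out] = base64.b64encode(sealed).decode()
    return response
```

In this sketch only the fields selected by the adapter are encrypted; other fields, such as a model name, pass through unchanged. Plaintext exists only between steps 1 and 3, that is, between the proxy and the sandbox.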

Workflow

The diagram below shows the steps and interactions between the different components to secure prompts for an inference workload.

[Figure: Continuum architecture]

Continuum differentiates between admins and users. An admin is responsible for initially establishing trust in the system by verifying the AS through the CLI. The admin is also entrusted with the inference secrets used for prompt encryption. A user is a consumer of the workload. The user is expected to share their inference secrets with the Continuum deployment and to trust the admin.

First, the admin must verify the integrity of the attestation service (AS) through the CLI. After successful verification, the CLI sends the inference secrets to the AS and sets the manifest.

Each worker CVM has a discovery mechanism to register with the AS. The registration involves a verification of the worker's attestation report by the AS. Upon success, the AS provides inference secrets to the worker.
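
A minimal sketch of this registration step, seen from the AS side, might look like the following. The `Registry` class, its method names, and the shape of the policy and report dictionaries are assumptions for illustration; the discovery mechanism and the cryptographic validation of the attestation report are omitted.

```python
from typing import Optional


class Registry:
    """Toy AS-side registry; names and fields are hypothetical."""

    def __init__(self, worker_policy: dict, secrets: dict):
        self.worker_policy = worker_policy   # requirements taken from the manifest
        self.secrets = secrets               # inference secrets set by the admin
        self.workers = set()                 # IDs of successfully verified workers

    def register(self, worker_id: str, report: dict) -> Optional[dict]:
        # Release secrets only if the report satisfies every policy entry.
        for key, required in self.worker_policy.items():
            if report.get(key) != required:
                return None                  # verification failed: no secrets
        self.workers.add(worker_id)
        return dict(self.secrets)            # hand out a copy of the secrets
```

A worker whose report deviates from the policy in any entry is neither recorded nor given secrets, so it can't decrypt requests for the workloads it hosts.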

The admin can then run an inference workload through the worker API, which provisions the encryption proxy and sandbox. The admin needs to decide which entities are allowed to access the worker API.

Next, the user can send encrypted prompts to the service, which are received by the encryption proxy. The proxy is configured with the respective inference API adapter and secrets to decrypt the prompt and securely forward the request to the sandbox. The proxy then encrypts the response from the inference server in the sandbox and sends it back to the user.