Version: 1.4

Overview

Continuum consists of two parts: the server side and the client side. The server side hosts the AI service and processes prompts securely. The client side verifies the server, encrypts the prompts, and sends inference requests to the server. This page explains how these components interact and details their respective roles.

Server-side architecture

The server side of Continuum hosts the inference service. Its architecture consists of two main components: the workers and the attestation service. We'll dive into each of these components in the following sections.

Worker

Worker nodes are central to the backend. They host an AI model and serve inference requests. The inference code and the model are supplied externally, by the platform provider and the model provider respectively.

The containerized inference code, referred to below as AI code, runs in a secure and isolated environment.

Each worker is a confidential VM (CVM) running Continuum's customized Linux OS, Continuum OS. This OS is minimal, immutable, and verifiable through remote attestation. Continuum OS hosts workloads in a secure sandbox environment and mediates network traffic through a server-side encryption proxy.

AI code sandbox

The AI code, provided by the platform provider, runs in a gVisor sandbox. In the case of Continuum, the platform provider is vLLM. The sandbox isolates the AI code from the host: it handles system calls in a user-space kernel and blocks network traffic to prevent data leaks.

Server-side encryption proxy

The AI code has an attached proxy container, which is its only connection to the outside world. It decrypts incoming requests from the client and forwards them to the sandbox. In the opposite direction, it encrypts responses and sends them back to the client.
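The request path through the proxy can be sketched as follows. This is a minimal illustration, not Continuum's implementation: the function names (`seal`, `open_sealed`, `proxy_handle`) are hypothetical, and the cipher is a toy encrypt-then-MAC construction built from the Python standard library as a stand-in for a real authenticated encryption scheme. This page doesn't specify which scheme Continuum uses.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16
TAG_LEN = 32


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC keys from the shared secret.
    return hashlib.sha256(label + key).digest()


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy stream cipher (illustration only): XOR with a SHAKE-256 keystream.
    stream = hashlib.shake_256(_subkey(key, b"enc") + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate; returns nonce || ciphertext || tag."""
    nonce = secrets.token_bytes(NONCE_LEN)
    ciphertext = _keystream_xor(key, nonce, plaintext)
    tag = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(key: bytes, message: bytes) -> bytes:
    """Verify the tag, then decrypt. Raises ValueError on tampering."""
    if len(message) < NONCE_LEN + TAG_LEN:
        raise ValueError("message too short")
    nonce = message[:NONCE_LEN]
    ciphertext = message[NONCE_LEN:-TAG_LEN]
    tag = message[-TAG_LEN:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return _keystream_xor(key, nonce, ciphertext)


def proxy_handle(key: bytes, encrypted_request: bytes, run_inference) -> bytes:
    # Decrypt the client's request, hand the plaintext to the sandboxed
    # AI code (modeled here as a callable), and encrypt the response.
    prompt = open_sealed(key, encrypted_request)
    response = run_inference(prompt)
    return seal(key, response)
```

In this sketch the AI code only ever sees plaintext through the `run_inference` callable, and everything that crosses the proxy boundary in either direction is ciphertext.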

Attestation service

The attestation feature of CVMs ensures the integrity and authenticity of Continuum workers. Both the service provider and clients can use it to verify the workers' integrity and to confirm that they're interacting with a trustworthy Continuum deployment.

Because workers can be dynamically scaled and handle concurrent requests, verifying each worker individually is impractical. Instead, the attestation service (AS) handles attestation centrally. On the server side, the AS verifies each worker based on its attestation statement. On the client side, the AS provides a system-wide attestation endpoint and handles key exchanges for prompt encryption.

The AS itself runs in a CVM. Workers register with the AS by providing their attestation statements, and only verified workers can serve inference requests. The AS also manages the distribution of end-to-end encryption secrets: verified workers synchronize with the AS to retrieve them.
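The registration flow described above can be modeled roughly as below. This is a simplified sketch under stated assumptions: the class and method names are hypothetical, and the attestation statement is reduced to a single measurement compared against a reference value. A real attestation statement is a hardware-signed report whose signature and contents would both need to be validated.

```python
import hmac
import secrets


class AttestationService:
    """Toy model of the AS: verify workers, then release secrets to them."""

    def __init__(self, expected_measurement: bytes):
        # Hypothetical reference value for a trusted worker image.
        self._expected = expected_measurement
        # End-to-end encryption secret shared with verified workers.
        self._secret = secrets.token_bytes(32)
        self._verified = set()

    def register(self, worker_id: str, measurement: bytes) -> bool:
        # Accept the worker only if its measurement matches the reference.
        ok = hmac.compare_digest(measurement, self._expected)
        if ok:
            self._verified.add(worker_id)
        return ok

    def fetch_secret(self, worker_id: str) -> bytes:
        # Only workers that passed verification may retrieve the secret.
        if worker_id not in self._verified:
            raise PermissionError(f"worker {worker_id!r} is not verified")
        return self._secret
```

A worker whose measurement doesn't match is never added to the verified set, so a later call to `fetch_secret` for that worker fails and it can't decrypt prompts.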

Client-side architecture

Clients first connect to the service through the continuum-proxy, which exchanges prompt encryption secrets via the AS. The user manages the continuum-proxy because it's the component that establishes trust in the Continuum deployment. The continuum-proxy then encrypts the client's requests and forwards them to the Continuum AI worker.
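The ordering of these client-side steps can be illustrated with a small sketch. All names here are hypothetical and don't reflect the continuum-proxy's actual interface: verification, key exchange, encryption, and transport are passed in as callables so the example stays self-contained. Handling of the encrypted reply is omitted.

```python
from typing import Callable, Optional


class ClientProxy:
    """Toy model of a client-side proxy: verify first, then encrypt and forward."""

    def __init__(
        self,
        verify_as: Callable[[], bool],
        exchange_secret: Callable[[], bytes],
        encrypt: Callable[[bytes, bytes], bytes],
        send_to_worker: Callable[[bytes], bytes],
    ):
        self._verify_as = verify_as
        self._exchange_secret = exchange_secret
        self._encrypt = encrypt
        self._send_to_worker = send_to_worker
        self._secret: Optional[bytes] = None

    def connect(self) -> None:
        # Establish trust before any key material is exchanged.
        if not self._verify_as():
            raise ConnectionError("attestation service failed verification")
        self._secret = self._exchange_secret()

    def request(self, prompt: bytes) -> bytes:
        # Refuse to send anything until verification and key exchange succeeded.
        if self._secret is None:
            raise RuntimeError("connect() must succeed before sending prompts")
        return self._send_to_worker(self._encrypt(self._secret, prompt))
```

The point of the design is that a plaintext prompt can't leave the proxy: `request` fails unless `connect` has completed, and only the output of `encrypt` is ever handed to the transport.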