
Example

Infrastructure

Continuum runs on Azure or on-premises, using AMD SEV-SNP confidential virtual machines (CVMs) and NVIDIA H100 GPUs. Support for Intel TDX will be added once cloud providers offer it in combination with confidential GPUs.

Deployment

This example shows how to deploy the Mistral-7B LLM for inference, using vLLM as the inference server.

# VM setup
terraform apply

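# Add vLLM as an allowed workload to the manifest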
tomlq --toml-output '.allowedWorkloads = {"vllm": {"workloadPort": 8000, "exposedPort": 8008, "gpuCount": 1, "imageURL": "ghcr.io/mistralai/mistral-src/vllm@sha256:901c65ada9ceabaebc40964418fdc0ccef406518035f7cd7316b09283ceaf29e"}}' manifest.toml > vllm-manifest.toml
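
The tomlq call registers the vLLM container image as an allowed workload. Assuming manifest.toml contains no other allowed workloads, the resulting vllm-manifest.toml should include an entry along these lines:

[allowedWorkloads.vllm]
workloadPort = 8000
exposedPort = 8008
gpuCount = 1
imageURL = "ghcr.io/mistralai/mistral-src/vllm@sha256:901c65ada9ceabaebc40964418fdc0ccef406518035f7cd7316b09283ceaf29e"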

# Initialization
continuum init -m vllm-manifest.toml -e <attestation_service_ip:port>

continuum secret set -s secrets.toml -e <attestation_service_ip:port> -k key.pem
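
The secrets file is user-defined; its actual schema is documented on the Configuration page. Purely as an illustration (the key name below is hypothetical, not part of the spec), a minimal secrets.toml could hold a value the workload needs at runtime:

# Hypothetical secrets.toml; see the Configuration page for the real schema.
hf_token = "<your_huggingface_token>"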

# Deployment. Spawns a protected AI workload.
curl -X POST -H "Content-Type: application/json" -d '{"workload": "vllm"}' http://<worker_ip>:8080/run
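
Once the workload is running, the vLLM server is reachable through the exposed port from the manifest (8008), as used in the next step.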

# Execution with encryption.
curl <worker_ip>:8008/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "mistralai/Mistral-7B-Instruct-v0.2","messages":"demo-app:3a71ea7448791716e325146b:a03acc195834a9822d676e797381c035418dde3539cf46ae61d0ef2ff59b81f1d7d05cc5d8b79cf2ec08c9ce147c90c"}'
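
For comparison, a plain OpenAI-compatible chat request would pass messages as an array of role/content objects, for example:

{"model": "mistralai/Mistral-7B-Instruct-v0.2", "messages": [{"role": "user", "content": "Hello!"}]}

With Continuum, this payload is encrypted on the client side before it is sent, which is why the request above carries an encrypted string in place of the plaintext messages.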

For more information on the format of the manifest and secrets files, refer to the Configuration page.