Example
Infrastructure
Continuum runs on Azure or on-premises, on AMD SEV-SNP confidential virtual machines (CVMs) with NVIDIA H100 GPUs. Intel TDX will be supported once cloud providers offer it in combination with confidential GPUs.
Deployment
This example shows how to deploy the Mistral-7B LLM for inference, using vLLM as the inference server.
- Continuum
- Default setup
# VM setup
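# Note: on a fresh checkout, run `terraform init` once before applying.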
terraform apply
# Initialization
continuum init -m manifest.toml -e <attestation_service_ip:port>
continuum secret set -s secrets.toml -e <attestation_service_ip:port> -k key.pem
# Deployment. Spawns a protected AI workload.
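# workload_port: port the workload serves inside the CVM (8000 is vLLM's default),
# exposed_port: port the worker exposes to clients, gpu_count: GPUs to attach,
# pull_options.image_url: container image to pull and run.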
curl -X POST -H "Content-Type: application/json" -d '{"workload_port": 8000, "exposed_port": 8008, "gpu_count": 1, "pull_options": {"image_url": "ghcr.io/mistralai/mistral-src/vllm:latest"}}' http://<worker_ip>:8080/run
# Execution with encryption.
curl <worker_ip>:8008/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "mistralai/Mistral-7B-Instruct-v0.2","messages":"demo-app:3a71ea7448791716e325146b:a03acc195834a9822d676e797381c035418dde3539cf46ae61d0ef2ff59b81f1d7d05cc5d8b79cf2ec08c9ce147c90c"}'
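Since the request's messages are encrypted, expect the returned message content to be ciphertext as well until the client decrypts it. Assuming the endpoint keeps vLLM's OpenAI-compatible response schema, the content field can be pulled out with a jq one-liner; this is a sketch, and request.json is a hypothetical file holding the encrypted body from the request above:
# Extract the (still encrypted) message content from the response.
curl -s <worker_ip>:8008/v1/chat/completions -H "Content-Type: application/json" -d @request.json | jq -r '.choices[0].message.content'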
For details on the format of the manifest and secrets files, refer to the Configuration page.
# VM setup
terraform apply
# Initialization
# A default setup has no initialization step.
# Deployment
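# --gpus all requires the NVIDIA Container Toolkit on the host;
# -p 8008:8000 maps host port 8008 to vLLM's default serving port 8000.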
docker run --gpus all -p 8008:8000 ghcr.io/mistralai/mistral-src/vllm:latest --model mistralai/Mistral-7B-Instruct-v0.2
# Execution without encryption.
curl <worker_ip>:8008/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "mistralai/Mistral-7B-Instruct-v0.2","messages": [{"role": "user", "content": "Describe confidential computing in one sentence."}]}'
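Here the response arrives in plaintext. Assuming vLLM's OpenAI-compatible response schema, the assistant's reply can be extracted with a jq filter; this is a generic sketch:
# Extract the plaintext reply from the response.
curl -s <worker_ip>:8008/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "mistralai/Mistral-7B-Instruct-v0.2","messages": [{"role": "user", "content": "Describe confidential computing in one sentence."}]}' | jq -r '.choices[0].message.content'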