Securing Large Language Models with NVIDIA GPUs and Edgeless Systems
Summary: The rapid advancement of artificial intelligence (AI) has led to significant concerns about data security and privacy. Large language model (LLM) deployments are particularly exposed because prompts and responses routinely contain sensitive data. Edgeless Systems, in collaboration with NVIDIA, has developed Continuum AI, a framework that uses confidential computing and NVIDIA H100 GPUs to keep prompts encrypted at all times, enabling organizations to safely utilize AI for sensitive data.
The Challenge of Securing AI Data Centers
As AI continues to grow in importance, protecting AI data centers from cybersecurity threats becomes increasingly critical. Companies are investing heavily in AI software and hardware, making these installations prime targets for hackers. The need for robust security measures is evident, especially for organizations handling sensitive data.
The Role of NVIDIA GPUs in AI Security
NVIDIA GPUs are widely used in AI data centers for their high processing power and efficiency; they can analyze security telemetry and recommend countermeasures far faster than conventional CPUs. Trend Micro, a cybersecurity firm, has partnered with NVIDIA to develop AI-powered security tools that run on NVIDIA GPUs. This partnership aims to secure private AI clouds and protect against data tampering and unauthorized access.
Edgeless Systems and Continuum AI
Edgeless Systems has launched Continuum AI, a generative AI framework that uses confidential computing and NVIDIA H100 GPUs to keep prompts encrypted at all times. By ensuring end-to-end data privacy, it lets organizations apply AI to sensitive data without exposing it to the infrastructure or service provider. Continuum AI works with large language models (LLMs) such as Mistral 7B and supports AI serving frameworks such as NVIDIA Triton Inference Server and vLLM.
How Continuum AI Works
Continuum AI relies on two core mechanisms: confidential computing and advanced sandboxing. Confidential computing is a hardware-based technology that keeps data encrypted even during processing and allows the integrity of workloads to be verified. This approach, powered by NVIDIA H100 Tensor Core GPUs, creates a secure environment that separates infrastructure and service providers from data and models.
The sandboxing mechanism runs AI code inside a sandbox on a confidential computing-protected AI worker, using an adapted version of Google’s gVisor sandbox. Because all traffic into and out of the sandbox is mediated by an encryption proxy, only encrypted prompts and responses ever cross the worker’s boundary, preventing plaintext data leaks.
System Architecture
Continuum AI consists of two main components: the server side, which hosts the AI service and processes prompts securely, and the client side, which encrypts prompts and verifies the server. The server-side architecture includes worker nodes and an attestation service.
Worker nodes, central to the backend, host AI models and serve inference requests. Each worker operates within a confidential VM (CVM) running Continuum OS, a minimal operating system whose integrity can be verified through remote attestation. The CVM hosts workloads in a sandbox and mediates network traffic through an encryption proxy, ensuring secure data handling.
The attestation service ensures the integrity and authenticity of worker nodes, allowing both service providers and clients to verify that they are interacting with a secure deployment. The service runs in a CVM and manages key exchanges for prompt encryption.
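The core of this check is simple: accept a worker only if its attestation report is authentic and its measurement matches the expected Continuum OS image. The sketch below illustrates that logic under stated assumptions; the measurement value, report layout, and HMAC-based signature are illustrative stand-ins for the hardware-backed attestation a real CVM produces.

```python
import hashlib
import hmac
import os

# Hypothetical expected measurement of the Continuum OS image; in a real
# deployment this value would come from a reproducible build of the image.
EXPECTED_MEASUREMENT = hashlib.sha256(b"continuum-os-image").hexdigest()

def verify_attestation(report: dict, signing_key: bytes) -> bool:
    """Accept a worker only if its attestation report is authentic and its
    measurement matches the expected image. HMAC stands in here for the
    hardware-backed signature a real confidential VM would produce."""
    expected_sig = hmac.new(
        signing_key, report["measurement"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected_sig, report["signature"]):
        return False  # report was tampered with or not signed by trusted hardware
    return report["measurement"] == EXPECTED_MEASUREMENT

# A well-formed report from a worker running the expected image:
signing_key = os.urandom(32)
report = {
    "measurement": EXPECTED_MEASUREMENT,
    "signature": hmac.new(
        signing_key, EXPECTED_MEASUREMENT.encode(), hashlib.sha256
    ).hexdigest(),
}
```

Only after this verification succeeds would a client or admin exchange the key used to encrypt prompts with that worker.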
Workflow and User Interaction
Admins verify the attestation service’s integrity through the CLI and configure AI code using the worker API. Verified workers receive inference secrets and can serve requests securely. Users interact with the attestation service and worker nodes, verifying deployments and sending encrypted prompts for processing. The encryption proxy decrypts these prompts, processes them in the sandbox, and re-encrypts responses before sending them back to users.
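The round trip above can be sketched in a few lines. This is a self-contained illustration, not Continuum's implementation: the hash-counter cipher stands in for the authenticated encryption (e.g. AES-GCM) a real encryption proxy would use, and `model` is a stub for the sandboxed LLM.

```python
import hashlib
import os

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Hash-counter keystream -- a toy stand-in for real authenticated encryption."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def seal(key: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    nonce = os.urandom(12)
    ct = bytes(p ^ k for p, k in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    return nonce, ct

def unseal(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    return bytes(c ^ k for c, k in zip(ciphertext, _keystream(key, nonce, len(ciphertext))))

def model(prompt: bytes) -> bytes:
    """Placeholder for the sandboxed LLM inference code."""
    return b"response to: " + prompt

def proxy_handle(key: bytes, nonce: bytes, enc_prompt: bytes) -> tuple[bytes, bytes]:
    """Encryption proxy: decrypt, run the sandboxed model, re-encrypt.
    Plaintext exists only inside the confidential VM."""
    prompt = unseal(key, nonce, enc_prompt)
    return seal(key, model(prompt))

# Client side: encrypt the prompt, send it, decrypt the response.
key = os.urandom(32)                      # established after attestation
nonce, enc_prompt = seal(key, b"hello")
r_nonce, enc_resp = proxy_handle(key, nonce, enc_prompt)
response = unseal(key, r_nonce, enc_resp)
```

Note that the key is established only after attestation succeeds, so only a verified worker can ever decrypt the prompt.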
Benefits of Continuum AI
Continuum AI offers several benefits, including:
- Data Privacy: Continuum AI ensures that user requests and responses remain encrypted throughout the AI service.
- Security: The solution protects AI model weights against infrastructure and service providers.
- Compliance: Continuum AI enables organizations to meet regulatory requirements by ensuring data privacy and security.
Table: Key Features of Continuum AI
| Feature | Description |
| --- | --- |
| Confidential Computing | Ensures data remains encrypted even during processing. |
| Advanced Sandboxing | Runs AI code inside a sandbox on a confidential computing-protected AI worker. |
| NVIDIA H100 GPUs | Power the confidential computing and sandboxing mechanisms. |
| AI Serving Frameworks | Supports frameworks such as NVIDIA Triton Inference Server and vLLM. |
| Data Privacy | Ensures user requests and responses remain encrypted throughout the AI service. |
| Security | Protects AI model weights against infrastructure and service providers. |
| Compliance | Enables organizations to meet regulatory requirements by ensuring data privacy and security. |
Conclusion
Securing large language models is crucial for organizations handling sensitive data. With Continuum AI, Edgeless Systems and NVIDIA combine confidential computing on H100 GPUs with advanced sandboxing so that prompts remain encrypted at all times and neither the infrastructure provider nor the service provider can access user data or model weights. This gives organizations a practical, verifiable path to deploying generative AI on sensitive data while protecting against data tampering and unauthorized access.